PubNub incident

Push Notifications Latency in Multiple PoPs

Major Resolved

PubNub experienced a major incident on June 13, 2024 affecting North America Points of Presence, European Points of Presence, and 3 more components, lasting 3h 1m. The incident has been resolved; the full update timeline is below.

Started
Jun 13, 2024, 06:51 PM UTC
Resolved
Jun 13, 2024, 09:53 PM UTC
Duration
3h 1m
Detected by Pingoru
Jun 13, 2024, 06:51 PM UTC

Affected components

North America Points of Presence
European Points of Presence
Asia Pacific Points of Presence
Southern Asia Points of Presence
Mobile Push Gateway

Update timeline

  1. investigating Jun 13, 2024, 06:51 PM UTC

    We have discovered an issue where push notifications have been delivered with delays in our Frankfurt point-of-presence for the last 30 minutes. Our Engineering teams are actively investigating the issue and we will provide updates here. If you believe you have experienced production impact due to this issue and would like to discuss it, please report impact to [email protected].

  2. investigating Jun 13, 2024, 07:25 PM UTC

    We have discovered that push notifications in our Mumbai point-of-presence are also affected and are being delivered with delays, as are push notifications in our Frankfurt point-of-presence. We continue to investigate and will provide updates here.

  3. investigating Jun 13, 2024, 07:26 PM UTC

    We are continuing to investigate this issue.

  4. investigating Jun 13, 2024, 07:39 PM UTC

    We have discovered that push notifications in our US-East point-of-presence are also affected and are being delivered with delays. We continue to investigate and will provide updates here.

  5. investigating Jun 13, 2024, 07:40 PM UTC

    We are continuing to investigate this issue.

  6. investigating Jun 13, 2024, 07:42 PM UTC

    We are continuing to investigate this issue.

  7. investigating Jun 13, 2024, 08:17 PM UTC

    Our Engineering teams continue to actively investigate the issue. We will continue to provide updates here.

  8. investigating Jun 13, 2024, 08:49 PM UTC

    Our investigation continues and we will continue to provide updates here.

  9. investigating Jun 13, 2024, 08:55 PM UTC

    We continue investigating delayed push notifications in our Frankfurt point-of-presence. Push notifications are now being delivered normally in our other regions. We will continue to provide updates here.

  10. investigating Jun 13, 2024, 08:56 PM UTC

    We are continuing to investigate this issue.

  11. monitoring Jun 13, 2024, 09:21 PM UTC

    Push notifications are now being delivered normally. We are monitoring the system to ensure no further issues.

  12. resolved Jun 13, 2024, 09:53 PM UTC

    This incident has been resolved, and mobile push notifications continue to be delivered normally. We will follow up with a root cause analysis soon. We sincerely apologize for any impact on our customers and their users. If you believe you have experienced production impact due to this issue and would like to discuss it, please reach out to [email protected].

  13. postmortem Jul 01, 2024, 04:19 PM UTC

    **Problem Description, Impact, and Resolution**

    At 18:16 UTC on June 13, 2024, we observed increased latency for delivery of mobile push messages in our Frankfurt and US-East points of presence. In response, we increased the resources available to the services and redeployed the service. The issue was resolved at 21:21 UTC on June 13, 2024. Upon further investigation, we identified that this issue occurred due to malformed message payloads creating a backlog in the message queue.

    **Mitigation Steps and Recommended Future Preventative Measures**

    To prevent a similar issue from occurring in the future, we increased the memory for the service to handle similar malformed payloads and added additional monitoring.
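
The postmortem attributes the backlog to malformed message payloads in the message queue. As an illustration only, the sketch below shows one common way a consumer can keep such payloads from holding up a queue: validate each message and set the bad ones aside in a dead-letter list. This is not PubNub's implementation. The payload shape (`device_token`, `body`) and the function names `is_well_formed` and `drain` are hypothetical, chosen for the example.

```python
from __future__ import annotations

import json
from collections import deque


def is_well_formed(raw: str) -> bool:
    """Hypothetical validity check: a JSON object with a non-empty
    'device_token' string and a 'body' string. The real payload schema
    is not described in the incident report."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(payload, dict)
        and isinstance(payload.get("device_token"), str)
        and bool(payload["device_token"])
        and isinstance(payload.get("body"), str)
    )


def drain(queue: deque[str]) -> tuple[list[dict], list[str]]:
    """Empty the queue, separating deliverable messages from malformed ones.

    Malformed payloads go to a dead-letter list instead of being retried,
    so they cannot accumulate at the head of the queue.
    """
    deliverable: list[dict] = []
    dead_letter: list[str] = []
    while queue:
        raw = queue.popleft()
        if is_well_formed(raw):
            deliverable.append(json.loads(raw))
        else:
            dead_letter.append(raw)
    return deliverable, dead_letter


if __name__ == "__main__":
    pending = deque([
        '{"device_token": "abc", "body": "hi"}',
        '{not json',
        '{"body": "no token"}',
    ])
    ok, bad = drain(pending)
    print(len(ok), "deliverable,", len(bad), "dead-lettered")
```

Tracking the size of the dead-letter list over time is one simple signal that additional monitoring, as mentioned in the mitigation steps, could alert on.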