PubNub incident

Some Services Across US East Are Suffering From Partial Outage

Major · Resolved

PubNub experienced a major incident on June 13, 2023, affecting North America Points of Presence, Access Manager Service, and App Context Service, lasting 3h 28m. The incident has been resolved; the full update timeline is below.

Started
Jun 13, 2023, 07:30 PM UTC
Resolved
Jun 13, 2023, 10:58 PM UTC
Duration
3h 28m
Detected by Pingoru
Jun 13, 2023, 07:30 PM UTC

Affected components

North America Points of Presence, Access Manager Service, App Context Service

Update timeline

  1. identified Jun 13, 2023, 07:53 PM UTC

    This is particularly affecting Authorization and downstream Objects.

  2. identified Jun 13, 2023, 08:28 PM UTC

    We have taken steps to mitigate this issue. Error rates and latency have been reduced.

  3. monitoring Jun 13, 2023, 08:59 PM UTC

    We are continuing to monitor the changes that we have implemented to mitigate this incident.

  4. resolved Jun 13, 2023, 10:58 PM UTC

    This incident has been resolved, and all services are fully restored.

  5. postmortem Jun 20, 2023, 01:31 PM UTC

    ### **Problem Description, Impact, and Resolution**

    At 19:20 UTC on Tuesday, June 13th, 2023, we observed increased error rates and latency for our Authorization services at our US East facility. In response, we redirected authorization services from US East to US West, and the issue was mitigated at 20:59 UTC on Tuesday, June 13th, 2023. During this time, we identified that the root cause of the issue was a third-party service incident. After confirming the third-party service incident was resolved, we rerouted the Authorization traffic back to US East at 22:58 UTC on Tuesday, June 13th, 2023.

    ### **Mitigation Steps and Recommended Future Preventative Measures**

    To prevent a similar issue from occurring in the future, we are developing a comprehensive failover plan to more quickly move services from one region to other regions. In the next few weeks, we will be implementing new processes to allow mitigation of regional service issues.
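For readers who want a concrete picture of the regional redirect described in the postmortem, the following is a minimal sketch of error-rate-based failover between two regions. It is not PubNub's implementation, which the postmortem does not describe. The class names (`RegionHealth`, `FailoverRouter`), the region labels, the sliding-window size, and the threshold are all hypothetical choices made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RegionHealth:
    """Sliding window of recent request outcomes for one region (hypothetical)."""
    window: int = 100
    outcomes: deque = field(default_factory=deque)

    def record(self, ok: bool) -> None:
        # Keep only the most recent `window` outcomes.
        self.outcomes.append(ok)
        if len(self.outcomes) > self.window:
            self.outcomes.popleft()

    def error_rate(self) -> float:
        # With no observations yet, treat the region as healthy.
        if not self.outcomes:
            return 0.0
        return self.outcomes.count(False) / len(self.outcomes)


class FailoverRouter:
    """Routes to the primary region unless it looks unhealthy (illustrative only)."""

    def __init__(self, primary: str, secondary: str, threshold: float = 0.5):
        self.primary = primary
        self.secondary = secondary
        self.threshold = threshold  # assumed error-rate cutoff, not a real setting
        self.health = {primary: RegionHealth(), secondary: RegionHealth()}

    def record(self, region: str, ok: bool) -> None:
        # Outcomes may come from live traffic or from background probes.
        self.health[region].record(ok)

    def target(self) -> str:
        # Fail over only when the primary exceeds the threshold AND the
        # secondary is doing better; otherwise stay on the primary.
        p = self.health[self.primary].error_rate()
        s = self.health[self.secondary].error_rate()
        if p > self.threshold and s < p:
            return self.secondary
        return self.primary


router = FailoverRouter("us-east", "us-west", threshold=0.2)
for _ in range(10):
    router.record("us-east", False)  # simulated failures in the primary
    router.record("us-west", True)   # secondary stays healthy
print(router.target())
```

In this sketch the router returns to the primary automatically once its recorded error rate drops back under the threshold, which assumes probes keep reporting on the primary while traffic is redirected. The incident above instead moved traffic back only after the third-party issue was confirmed resolved, so a real plan might gate that return step on a manual check.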