PubNub incident
Elevated History Error/Latency in US West Region
PubNub experienced a minor incident on December 9, 2024, affecting the North America Points of Presence and the Storage and Playback Service, lasting 1h 18m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Dec 09, 2024, 08:42 AM UTC
Around 07:23 UTC, we began to notice increasing errors and latency for History in the SJC region (US West).
- identified Dec 09, 2024, 08:55 AM UTC
The issue has been identified and a fix is being implemented.
- monitoring Dec 09, 2024, 09:28 AM UTC
A fix has been implemented and we are monitoring the results.
- resolved Dec 09, 2024, 10:00 AM UTC
This incident has been resolved with no errors observed for the last 30 minutes. We apologize for any impact this may have had on your service. Please contact PubNub Support if you wish to discuss the impact on your service. An RCA will be provided soon.
- postmortem Dec 16, 2024, 11:16 PM UTC
### **Problem Description, Impact, and Resolution**

At 07:23 UTC on December 9th, we received alerts indicating high error levels related to storage writer operations in one of our data centers. Shortly after, one of our third-party service providers reported service disruption in their environment. The service provider began the process of replacing the affected nodes. Throughout the restoration process, we closely monitored our systems to assess how the issue impacted our environment. Once all nodes were successfully restored, error levels returned to normal, and all associated alerts were resolved.

### **Mitigation Steps and Recommended Future Preventative Measures**

We have worked with our vendor to ensure that nodes of this type will be on redundant infrastructure going forward, so that there is less exposure to this kind of incident.