PubNub incident

Increased errors and latency for presence in FRA and BOM

Minor · Resolved

PubNub experienced a minor incident on May 2, 2025. The incident has been resolved; the full update timeline is below.

Started
May 02, 2025, 03:23 PM UTC
Resolved
May 02, 2025, 03:23 PM UTC
Duration
Detected by Pingoru
May 02, 2025, 03:23 PM UTC

Update timeline

  1. resolved May 02, 2025, 04:23 PM UTC

    Beginning at 15:40 UTC, we detected increased errors and latency for the Presence service in the BOM and FRA regions. Our engineers investigated the issue and restored the service, which has been stable since 15:52 UTC.

  2. postmortem May 08, 2025, 07:43 PM UTC

    ## **Problem Description, Impact, and Resolution**

    On May 2, 2025 at 15:40 UTC, we observed elevated errors and increased latency for customers using our Presence service in the Frankfurt (FRA) and Mumbai (BOM) regions. After observing the errors and latency, we increased the memory allocation and number of replicas for the affected services, and the issue was resolved on May 2, 2025 at 16:00 UTC.

    This issue occurred because we did not have adequate resource thresholds and alerting configured to proactively scale in response to a sudden spike in subscribe traffic, which led to resource exhaustion in key components of our Presence infrastructure.

    ## **Mitigation Steps and Recommended Future Preventative Measures**

    To prevent a similar issue from occurring in the future, we have permanently increased resource limits and replicas for the Presence service in impacted regions. In the next week, we will also improve our alerting and monitoring to detect abnormal traffic patterns earlier and trigger automated scaling where possible.