Zulip incident

Delays in push notifications

Minor Resolved

Zulip experienced a minor incident on December 4, 2023 affecting Mobile Push Notification Service, lasting 6h 41m. The incident has been resolved; the full update timeline is below.

Started
Dec 04, 2023, 03:42 PM UTC
Resolved
Dec 04, 2023, 10:24 PM UTC
Duration
6h 41m
Detected by Pingoru
Dec 04, 2023, 03:42 PM UTC

Affected components

Mobile Push Notification Service

Update timeline

  1. identified Dec 04, 2023, 03:42 PM UTC

    We are currently experiencing a backlog in our push notifications service, which is delaying delivery of push notifications from Zulip Cloud to devices by up to 5 minutes. Push notifications from self-hosted Zulip servers using our push bouncer services are not currently affected.

  2. identified Dec 04, 2023, 04:30 PM UTC

    We are working on a fix for the issue. Notifications are now backlogged by 10 minutes.

  3. monitoring Dec 04, 2023, 06:20 PM UTC

    The mobile notifications delay is now down to 2.5 minutes, and we expect to clear that backlog shortly. We will continue to monitor the situation.

  4. resolved Dec 04, 2023, 10:24 PM UTC

    This incident has been resolved.

  5. postmortem Dec 05, 2023, 08:00 PM UTC

    Zulip sends push notifications from a background worker, which talks to Apple and Google servers to notify applications on users' mobile devices. On Monday, December 4, the rate at which the Zulip Cloud service generated push notifications exceeded the rate at which a single worker could send them, leading to a backlog of notifications. Other load on the system compounded the backlog, leading to delays of up to 10 minutes between when a push notification was triggered and when it was sent to users' mobile devices. We have since split the workers that deliver these notifications, allowing us to process many more in parallel.
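
The backlog dynamics described in the postmortem can be illustrated with a small queueing sketch. This is not Zulip's code: the function name `backlog_delay_minutes` and every rate below are hypothetical, chosen only so the arithmetic is easy to follow. It assumes a constant arrival rate, identical workers, and a first-in, first-out queue, and it ignores the "other load" the postmortem mentions.

```python
def backlog_delay_minutes(arrival_per_min, per_worker_per_min, workers, minutes):
    """Estimate the queueing delay, minute by minute, for a notification queue.

    All parameters are illustrative assumptions, not measured Zulip values.
    Returns a list with the estimated delay (in minutes) at the end of each minute.
    """
    capacity = per_worker_per_min * workers  # notifications sent per minute
    backlog = 0.0                            # notifications waiting in the queue
    delays = []
    for _ in range(minutes):
        # The backlog grows by whatever arrives beyond capacity; it never goes negative.
        backlog = max(0.0, backlog + arrival_per_min - capacity)
        # A newly queued notification waits for everything ahead of it to be sent.
        delays.append(backlog / capacity)
    return delays


# Hypothetical numbers: 1200 notifications/min arriving, 1000/min per worker.
single = backlog_delay_minutes(1200, 1000, workers=1, minutes=50)
split = backlog_delay_minutes(1200, 1000, workers=4, minutes=50)

print(f"one worker, after 50 min: {single[-1]:.1f} min delay")
print(f"four workers, after 50 min: {split[-1]:.1f} min delay")
```

Under these assumed rates, a single worker falls behind by 200 notifications every minute, so the delay keeps growing for as long as the load lasts. With the work split across several workers, capacity exceeds the arrival rate and no backlog forms.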