Courier incident

Delay in Message Processing

Major · Resolved

Courier experienced a major incident on August 19, 2024, affecting the Web Application, API, and Observability components and lasting 3h 3m. The incident has been resolved; the full update timeline is below.

Started
Aug 19, 2024, 06:32 PM UTC
Resolved
Aug 19, 2024, 09:35 PM UTC
Duration
3h 3m
Detected by Pingoru
Aug 19, 2024, 06:32 PM UTC

Affected components

Web Application, API, Observability

Update timeline

  1. identified Aug 19, 2024, 06:32 PM UTC

    The Courier team identified an issue in our health monitoring involving our message event processing, and a revert is in place.

  2. monitoring Aug 19, 2024, 06:48 PM UTC

    Our team has released a revert to address the regression and it's in the process of merging.

  3. monitoring Aug 19, 2024, 06:50 PM UTC

    The release is live, and the team is monitoring it.

  4. monitoring Aug 19, 2024, 06:56 PM UTC

    Release is published and building to production. ETA ~45 minutes.

  5. monitoring Aug 19, 2024, 08:02 PM UTC

    Fix has been deployed, and enqueued messages have started to go through slowly. Once the bottleneck clears, messages should start to flow normally.

  6. resolved Aug 19, 2024, 09:35 PM UTC

    The general pipeline has recovered.