Scout APM incident

Delayed Ingestion for Some Customers

Critical · Resolved

Scout APM experienced a critical incident on November 20, 2019, affecting Application Monitoring and lasting 1d 2h. The incident has been resolved; the full update timeline is below.

Started: Nov 20, 2019, 01:42 PM UTC
Resolved: Nov 21, 2019, 04:04 PM UTC
Duration: 1d 2h
Detected by Pingoru: Nov 20, 2019, 01:42 PM UTC

Affected components

Application Monitoring

Update timeline

  1. investigating Nov 20, 2019, 01:42 PM UTC

    One of our ingestion servers is falling a little behind, so ingestion for any customers on that server will be delayed. All data is safe and is being processed.

  2. investigating Nov 20, 2019, 02:23 PM UTC

    We've identified a handful of incoming messages that have slowed our ingestion processing, causing it to fall behind. This has tripped circuit breakers in other parts of our app. All data is stored, but ingestion as a whole is paused.

  3. monitoring Nov 20, 2019, 03:45 PM UTC

    We've isolated the issue to a single account. We're in contact with that customer and have restarted ingestion for all other accounts.

  4. resolved Nov 21, 2019, 04:04 PM UTC

    This incident has been resolved.
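
The timeline describes isolating one account's slow messages so that ingestion could resume for everyone else. The sketch below illustrates that general pattern only; it is not Scout APM's implementation. The class `IngestionQueue`, its parameters (`slow_limit`, `budget_ms`), and the simulated `cost_ms` value are all hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque


class IngestionQueue:
    """Hypothetical per-account ingestion with quarantine (illustrative only)."""

    def __init__(self, slow_limit=3):
        self.slow_limit = slow_limit         # slow messages tolerated per account
        self.slow_counts = defaultdict(int)  # account_id -> slow messages seen
        self.quarantined = set()             # accounts whose ingestion is paused
        self.held = defaultdict(deque)       # stored but unprocessed messages
        self.processed = []                  # (account_id, payload) pairs ingested

    def submit(self, account_id, payload, cost_ms, budget_ms=100):
        # cost_ms is a simulated processing cost (an assumption of this sketch).
        # Quarantined accounts keep their data, but it is held, not processed.
        if account_id in self.quarantined:
            self.held[account_id].append(payload)
            return "held"
        if cost_ms > budget_ms:
            self.slow_counts[account_id] += 1
            # Too many slow messages: pause this account only.
            if self.slow_counts[account_id] >= self.slow_limit:
                self.quarantined.add(account_id)
                self.held[account_id].append(payload)
                return "held"
        self.processed.append((account_id, payload))
        return "processed"

    def release(self, account_id):
        # Lift the quarantine and drain the held messages in arrival order.
        self.quarantined.discard(account_id)
        self.slow_counts[account_id] = 0
        drained = 0
        while self.held[account_id]:
            self.processed.append((account_id, self.held[account_id].popleft()))
            drained += 1
        return drained
```

In this sketch, a quarantined account's messages are retained and replayed by `release`, which mirrors the report's statement that all data was stored while ingestion was paused. Other accounts continue to be processed throughout.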