Scout APM incident

Agent payloads being rejected, missing dashboard data

Critical Resolved

Scout APM experienced a critical incident on March 3, 2022 affecting Application Monitoring, lasting 7h 14m. The incident has been resolved; the full update timeline is below.

Started
Mar 03, 2022, 10:15 PM UTC
Resolved
Mar 04, 2022, 05:29 AM UTC
Duration
7h 14m
Detected by Pingoru
Mar 03, 2022, 10:15 PM UTC

Affected components

Application Monitoring

Update timeline

  1. identified Mar 03, 2022, 10:15 PM UTC

    We have encountered an issue with our Kafka cluster that is preventing agent payloads from being written to Kafka for storage and processing. You will not see current data in your dashboards until the issue is resolved. We are deploying the fix now.

  2. monitoring Mar 03, 2022, 10:51 PM UTC

    We have implemented the fix and the Kafka cluster is operating normally. Agent check-in payloads are being ingested and processed again as of 3:41 PM MT. Data from 2:40 to 3:30 PM MT will not be backfilled into charts.

  3. resolved Mar 04, 2022, 05:29 AM UTC

    This incident has been resolved.
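
The second update gives its times in MT, while the rest of this page uses UTC. The sketch below converts those times so they can be lined up with the timeline, and re-derives the stated duration. It assumes "MT" means US Mountain Time, which was on standard time (UTC-7) on March 3, 2022, since US daylight saving time began on March 13 that year. The helper name `mt_to_utc` is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "MT" is US Mountain Time, on standard time (UTC-7) on Mar 03, 2022.
MST = timezone(timedelta(hours=-7), "MST")


def mt_to_utc(hour, minute):
    """Convert a Mar 03, 2022 wall-clock time in MT to UTC (hypothetical helper)."""
    local = datetime(2022, 3, 3, hour, minute, tzinfo=MST)
    return local.astimezone(timezone.utc)


# Times quoted in the "monitoring" update.
gap_start = mt_to_utc(14, 40)       # start of the window that will not backfill
gap_end = mt_to_utc(15, 30)         # end of that window
ingest_resumed = mt_to_utc(15, 41)  # payloads ingested again

for label, ts in [
    ("Gap start", gap_start),
    ("Gap end", gap_end),
    ("Ingestion resumed", ingest_resumed),
]:
    print(f"{label}: {ts:%b %d, %Y, %I:%M %p} UTC")

# Started and Resolved timestamps as listed on this page (already UTC).
started = datetime(2022, 3, 3, 22, 15, tzinfo=timezone.utc)
resolved = datetime(2022, 3, 4, 5, 29, tzinfo=timezone.utc)
duration = resolved - started
print(f"Duration: {duration}")
```

Under that assumption, the non-backfilled window falls entirely on Mar 03 in UTC, and the computed duration can be compared with the 7h 14m shown above.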