Datadog Govcloud experienced a critical incident on August 4, 2021 affecting APM and Log Management and 1 more component, lasting 3h 9m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Aug 04, 2021, 04:57 PM UTC
We’re investigating increased metric latencies and API errors. Graphs may be delayed. To avoid spurious alerts, we’ve temporarily disabled alerts for Metrics Monitors.
- identified Aug 04, 2021, 05:30 PM UTC
We have identified a performance issue affecting Metrics, Logs, APM and alerting. We are working to restore access.
- identified Aug 04, 2021, 06:29 PM UTC
The web application is responding without errors. We are still working to restore timely processing of tracing, logs, metrics and alerts.
- identified Aug 04, 2021, 07:24 PM UTC
Realtime data should be available for all customers and we are actively working to restore delayed historical data. Most metric and logs based monitors are restored and we are working to restore the remainder shortly.
- identified Aug 04, 2021, 07:35 PM UTC
All crawler based metrics are fully restored and current.
- monitoring Aug 04, 2021, 07:49 PM UTC
All metrics based monitors are fully restored and we are continuing to restore log based monitors as we work through relevant backlogs.
- resolved Aug 04, 2021, 08:07 PM UTC
This incident has been resolved.