Datadog US3 experienced a major incident on November 2, 2023 affecting Metrics and Infra Monitoring and Monitors, lasting 30m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- identified Nov 02, 2023, 10:55 AM UTC
We have identified an issue which caused temporarily elevated error rates on our web application (500 error pages) and increased latency processing monitoring data. As a result of this issue, customers might still some metrics are delayed as well as monitors relying on this data. We are currently working on a fix and the data is being backfilled. We will provide another update once the service is fully operational again.
- identified Nov 02, 2023, 11:02 AM UTC
We are continuing to work on a fix for this issue.
- monitoring Nov 02, 2023, 11:12 AM UTC
All components have recovered now. We are replaying a few failed notifications.
- resolved Nov 02, 2023, 11:25 AM UTC
This incident has been resolved.