Datadog Govcloud experienced a critical incident on November 2, 2021, affecting APM, Log Management, and 1 more component, lasting 3h 16m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Nov 02, 2021, 05:43 PM UTC
We are investigating loading issues on our web application. As a result, some users might be getting errors when loading the web application.
- identified Nov 02, 2021, 06:14 PM UTC
We are continuing to investigate this issue with our provider. Network connectivity issues in the region are still causing problems loading the application and are delaying data and alerts in the Datadog Govcloud region.
- identified Nov 02, 2021, 06:50 PM UTC
Network connectivity issues in the region are still causing problems loading the application and are delaying data and alerts in the Datadog Govcloud region. Our provider has acknowledged the issue and is working to resolve it.
- identified Nov 02, 2021, 07:28 PM UTC
Network connectivity issues in the region are continuing to cause problems loading the application and are delaying data and alerts in the Datadog Govcloud region. Our provider has acknowledged the issue and is working to resolve it. We have also begun our own mitigations.
- identified Nov 02, 2021, 07:50 PM UTC
Our provider has resolved the underlying network issue. We are now scaling up our systems to handle the backlog.
- resolved Nov 02, 2021, 09:00 PM UTC
We have now recovered for live data and monitors. At this point, customers might still see gaps in metrics data between 1:25 and 3:25 EDT. We will follow up with specific affected customers through our usual support channels.