Datadog US1 Outage History

Datadog US1 had 40 outages in the last 2 years, totaling 66h 53m of downtime and averaging 1.6 incidents per month.

The incidents recorded since June 24, 2025 are summarised below, with details, duration, and resolution information for each.
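
As a rough check on the headline rate, the arithmetic can be sketched as follows. The 24-month window is an assumption (the page only says "the last 2 years"), and the variable names are illustrative.

```python
import math

outages = 40          # count quoted in the summary
window_months = 24    # assumption: "the last 2 years" means exactly 24 months

rate = outages / window_months
rounded = round(rate, 1)                 # conventional rounding to one decimal
truncated = math.floor(rate * 10) / 10   # dropping everything after one decimal

print(f"rounded:   {rounded}")
print(f"truncated: {truncated}")
```

Under that assumption the rounded value is 1.7 and the truncated value is 1.6, so the quoted 1.6 is consistent with truncation or with a window slightly longer than 24 months.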

Source: https://status.datadoghq.com

Minor November 17, 2025

Dashboards Not Loading

Detected by Pingoru: Nov 17, 2025, 05:20 PM UTC
Resolved: Nov 17, 2025, 05:20 PM UTC
Duration: —

Timeline · 1 update

resolved Nov 17, 2025, 05:20 PM UTC

Nov 17, 12:20 EST Resolved - All errors stopped as of 12:02 ET. This incident has been resolved.
Nov 17, 12:12 EST Monitoring - The rollout of a fix is in progress and we are no longer seeing errors. We are monitoring the incident and are on the path to recovery.
Nov 17, 11:40 EST Identified - The issue has been identified and we are taking measures to mitigate it while working on a fix.
Nov 17, 11:23 EST Investigating - We are investigating loading issues on the dashboard pages. As a result, some users might be getting errors when loading the web application. Please note that data processing and alerts are not affected by this incident.
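
The timeline entries for this incident are stamped in EST, while the header fields are in UTC, and the header shows no duration. A minimal sketch for converting the EST stamps and measuring the span between the first and last update is given below. The helper name `est_to_utc` is hypothetical, the year is taken from the incident header because the timeline stamps omit it, and EST is modelled as a fixed UTC-5 offset.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Eastern Standard Time (UTC-5); not valid for daylight-saving dates.
EST = timezone(timedelta(hours=-5), "EST")

def est_to_utc(stamp: str, year: int) -> datetime:
    """Parse a stamp such as 'Nov 17, 12:20' (EST) and return it in UTC."""
    local = datetime.strptime(f"{year} {stamp}", "%Y %b %d, %H:%M")
    return local.replace(tzinfo=EST).astimezone(timezone.utc)

investigating = est_to_utc("Nov 17, 11:23", 2025)
resolved = est_to_utc("Nov 17, 12:20", 2025)

print(resolved.strftime("%b %d, %Y, %I:%M %p UTC"))
print(resolved - investigating)
```

The converted resolution time matches the 05:20 PM UTC shown in the header, and the span between the Investigating and Resolved updates is 57 minutes.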

Minor November 5, 2025

Delayed Metrics for APM and distribution metrics

Detected by Pingoru: Nov 05, 2025, 05:57 PM UTC
Resolved: Nov 05, 2025, 05:57 PM UTC
Duration: —

Timeline · 1 update

resolved Nov 05, 2025, 05:57 PM UTC

Nov 5, 12:57 EST Resolved - APM metrics are now processing live.
Nov 5, 11:59 EST Update - Distribution metrics should be back to processing live, without latency. APM metrics are still delayed, and we are actively working to bring them back to live.
Nov 5, 11:25 EST Update - We have identified the root cause and scaled up processing to catch up with the lag.
Nov 5, 10:42 EST Identified - We are investigating increased latency processing metrics from APM and distribution metrics. As a result of this issue, some users may see delays or gaps for metrics on graphs. To prevent spurious alerts, we have temporarily disabled monitors based on this data.

Minor October 22, 2025

Multiple products impacted with data delays

Detected by Pingoru: Oct 22, 2025, 02:40 PM UTC
Resolved: Oct 22, 2025, 02:40 PM UTC
Duration: —

Timeline · 1 update

Critical October 14, 2025

Delayed AWS, GCP, Azure, SaaS integrations Metrics and Logs

Detected by Pingoru: Oct 14, 2025, 05:14 PM UTC
Resolved: Oct 14, 2025, 07:19 PM UTC
Duration: 2h 4m

Affected: Metrics and Infra Monitoring

Timeline · 5 updates

investigating Oct 14, 2025, 05:14 PM UTC

We are investigating increased latency processing AWS, GCP and Azure Metrics. As a result of this issue, some users may see delays or gaps in graphs that contain these metrics. To prevent spurious alerts, we have temporarily disabled monitors based on this data.
identified Oct 14, 2025, 05:29 PM UTC

The issue has been identified and a fix is being implemented.
monitoring Oct 14, 2025, 05:43 PM UTC

A fix has been implemented, and we are monitoring the results.
monitoring Oct 14, 2025, 06:44 PM UTC

Data flow has been restored for new incoming data. We are currently backfilling historical data.
resolved Oct 14, 2025, 07:19 PM UTC

This incident has been resolved.
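
The Duration field can be approximated from the Detected and Resolved stamps. The sketch below assumes the timestamp format shown on this page; `span_minutes` is a hypothetical helper, not part of any Pingoru or Datadog tooling.

```python
from datetime import datetime

# Format of the header stamps on this page, e.g. "Oct 14, 2025, 05:14 PM UTC".
FMT = "%b %d, %Y, %I:%M %p UTC"

def span_minutes(detected: str, resolved: str) -> int:
    """Whole minutes between two header timestamps."""
    start = datetime.strptime(detected, FMT)
    end = datetime.strptime(resolved, FMT)
    return int((end - start).total_seconds() // 60)

minutes = span_minutes("Oct 14, 2025, 05:14 PM UTC", "Oct 14, 2025, 07:19 PM UTC")
hours, rest = divmod(minutes, 60)
print(f"{hours}h {rest}m")
```

At minute precision this gives 2h 5m, one minute more than the listed 2h 4m. A plausible explanation, though it is only an assumption, is that the listed duration is computed from timestamps that include seconds and is then truncated.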

Minor September 29, 2025

Host Tags, Service Checks, and Datadog Events Delayed Evaluation

Detected by Pingoru: Sep 29, 2025, 06:17 PM UTC
Resolved: Sep 29, 2025, 07:43 PM UTC
Duration: 1h 25m

Affected: Metrics and Infra Monitoring, Monitors

Timeline · 4 updates

investigating Sep 29, 2025, 06:17 PM UTC

We’re currently investigating an issue causing delayed processing of host tag updates, Service Checks and Datadog Events, which may result in stale data appearing in the frontend. Our team is actively working to mitigate and fully resolve this. I’ll follow up as soon as the issue has been resolved. To prevent spurious alerts, we have temporarily disabled monitors based on this data.
identified Sep 29, 2025, 06:20 PM UTC

The issue has been identified and a fix is being implemented.
monitoring Sep 29, 2025, 07:36 PM UTC

Service Check and Datadog Events monitor evaluation has recovered and data is up-to-date. Host tag updates are still recovering and stale host tags may appear in the frontend.
resolved Sep 29, 2025, 07:43 PM UTC

This incident has been resolved.

Major September 18, 2025

[SSO] Login Errors from Google SSO

Detected by Pingoru: Sep 18, 2025, 03:11 PM UTC
Resolved: Sep 18, 2025, 04:24 PM UTC
Duration: 1h 13m

Affected: Web Application

Timeline · 3 updates

investigating Sep 18, 2025, 03:11 PM UTC

We are investigating user login issues with the web application via Google SSO. Please note that data processing and alerts are not affected by this incident.
monitoring Sep 18, 2025, 03:47 PM UTC

We are seeing recovery in Google SSO logins. We are continuing to monitor for issues.
resolved Sep 18, 2025, 04:24 PM UTC

This incident has been resolved.

Minor August 28, 2025

Periodic network interruption communicating with multiple Azure regions

Detected by Pingoru: Aug 28, 2025, 07:30 PM UTC
Resolved: Aug 30, 2025, 06:58 PM UTC
Duration: 1d 23h

Affected: APM, Log Management, Metrics and Infra Monitoring, Monitors

Timeline · 5 updates

investigating Aug 28, 2025, 07:30 PM UTC

Degraded network capacity in an Azure datacenter is causing network packet loss and increased latency when communicating with AWS in eastern US regions. Customers may experience communications failures trying to submit data from agents running in AWS and may experience delayed data from AWS integrations. Microsoft has identified the root cause and is working on mitigations.
monitoring Aug 28, 2025, 08:37 PM UTC

The root cause of the issue has been identified and Microsoft has implemented mitigations. We are monitoring network traffic to confirm.
monitoring Aug 29, 2025, 08:47 PM UTC

Azure has temporarily mitigated the network capacity issues which have caused episodic packet loss for customers who are hosted in Azure data centers and are using Datadog’s US1 region (accessible via https://app.datadoghq.com). Azure engineers are continuing to work to fully resolve this issue. Until they fully resolve the issue, customers with Datadog agents running in Azure data centers may see brief periods of delayed ingestion of data from agents and from Azure integrations. We don’t expect a noticeable impact thanks to agent buffering but cannot exclude the possibility of spurious alerts due to temporarily delayed data. We are continuing to monitor the situation in conjunction with Azure, and will do so throughout the weekend. We will post status page updates as soon as the situation improves, and at least every 24 hours. We thank you for your patience throughout this incident.
monitoring Aug 30, 2025, 03:12 AM UTC

Azure has implemented a permanent fix to the network issue. Both Azure and Datadog engineers are continuing to monitor overnight and will provide an update tomorrow.
resolved Aug 30, 2025, 06:58 PM UTC

Our monitoring has shown Azure’s fix to be stable since our last update. This incident has been resolved.

Notice August 28, 2025

PagerDuty Monitor Notifications Delayed

Detected by Pingoru: Aug 28, 2025, 04:39 AM UTC
Resolved: Aug 28, 2025, 09:05 AM UTC
Duration: 4h 26m

Affected: Monitors

Timeline · 3 updates

investigating Aug 28, 2025, 04:39 AM UTC

Monitor notifications are delayed for PagerDuty.
monitoring Aug 28, 2025, 08:03 AM UTC

We are observing some recovery from the notification delays and continue to monitor the situation. Please follow our integration status page for details: https://datadogintegrations.statuspage.io/
resolved Aug 28, 2025, 09:05 AM UTC

PagerDuty notification deliveries are back to normal.

Minor August 27, 2025

Partial metrics drop from Datadog Agent in the westus2 Azure region to the Datadog US1 datacenter

Detected by Pingoru: Aug 27, 2025, 05:58 PM UTC
Resolved: Aug 27, 2025, 08:53 PM UTC
Duration: 2h 54m

Affected: Metrics and Infra Monitoring

Timeline · 2 updates

monitoring Aug 27, 2025, 05:58 PM UTC

We noticed a partial metrics drop from the Datadog Agent in the westus2 Azure region to the Datadog US1 datacenter. We are actively investigating.
resolved Aug 27, 2025, 08:53 PM UTC

We noticed a partial data drop from the Datadog Agent in the westus2 Azure region to the Datadog US1 datacenter. The data drop has stopped, and we are monitoring the situation.

Minor August 21, 2025

Duplicate Logs in Aggregated Queries

Detected by Pingoru: Aug 21, 2025, 09:57 PM UTC
Resolved: Aug 21, 2025, 11:40 PM UTC
Duration: 1h 43m

Affected: Log Management, RUM

Timeline · 4 updates

investigating Aug 21, 2025, 09:57 PM UTC

We are investigating an issue processing Logs. As a result of this issue, some users may see inconsistencies in log queries.
identified Aug 21, 2025, 11:29 PM UTC

The issue has been identified and a fix is being implemented.
monitoring Aug 21, 2025, 11:35 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Aug 21, 2025, 11:40 PM UTC

This incident has been resolved.

Minor August 5, 2025

Degraded Web Application

Detected by Pingoru: Aug 05, 2025, 04:38 PM UTC
Resolved: Aug 05, 2025, 05:36 PM UTC
Duration: 57m

Affected: Web Application

Timeline · 4 updates

investigating Aug 05, 2025, 04:38 PM UTC

Due to access control failures, we're seeing downstream impact to multiple products. Our team is actively working to identify the root cause and resolve the issue. We will provide a more specific update shortly.
investigating Aug 05, 2025, 04:39 PM UTC

We are continuing to investigate this issue.
monitoring Aug 05, 2025, 04:44 PM UTC

We have identified the issue and implemented a fix. We are monitoring the recovery of the impacted products.
resolved Aug 05, 2025, 05:36 PM UTC

This incident has been resolved.

Major July 18, 2025

Google SSO login errors

Detected by Pingoru: Jul 18, 2025, 03:30 PM UTC
Resolved: Jul 18, 2025, 04:09 PM UTC
Duration: 38m

Affected: Web Application

Timeline · 3 updates

investigating Jul 18, 2025, 03:30 PM UTC

We are investigating user login issues with the web application via Google SSO. Please note that data processing and alerts are not affected by this incident.
identified Jul 18, 2025, 03:41 PM UTC

Google declared an incident regarding this issue: https://www.google.com/appsstatus/dashboard/incidents/oFcAZTr4EVieF5Fr6Ee9
resolved Jul 18, 2025, 04:09 PM UTC

This incident has been resolved.

Notice July 9, 2025

Degraded Web Application Performance & Monitor Evaluations

Detected by Pingoru: Jul 09, 2025, 08:46 PM UTC
Resolved: Jul 09, 2025, 09:46 PM UTC
Duration: 1h

Affected: Cloud Cost Management, Monitors, Web Application

Timeline · 4 updates

investigating Jul 09, 2025, 08:46 PM UTC

We're investigating an issue with our metrics and monitor evaluations that is causing degraded web application performance and skipped monitors.
identified Jul 09, 2025, 09:06 PM UTC

We've identified a possible root cause and we're actively working on mitigating the impact.
monitoring Jul 09, 2025, 09:26 PM UTC

We've implemented a fix and we're seeing recovery in monitor evaluations and dashboards. We'll continue to investigate and monitor for further impact.
resolved Jul 09, 2025, 09:46 PM UTC

This incident has been resolved.

Major July 7, 2025

Monitors - Delayed Evaluation of logs monitors

Detected by Pingoru: Jul 07, 2025, 02:39 PM UTC
Resolved: Jul 07, 2025, 05:07 PM UTC
Duration: 2h 28m

Affected: Monitors

Timeline · 3 updates

investigating Jul 07, 2025, 02:39 PM UTC

We are investigating delays in the evaluation of log-based monitors, which began at 01:30:00 PM UTC.
monitoring Jul 07, 2025, 03:27 PM UTC

The team rolled out a change and has been seeing recovery. The team will continue monitoring for a period of time.
resolved Jul 07, 2025, 05:07 PM UTC

This incident has been resolved.

Major June 24, 2025

Logs Monitors - Delayed Evaluations

Detected by Pingoru: Jun 24, 2025, 09:16 PM UTC
Resolved: Jun 24, 2025, 09:48 PM UTC
Duration: 31m

Affected: Monitors

Timeline · 3 updates

investigating Jun 24, 2025, 09:16 PM UTC

We are investigating delays in Logs Monitors Evaluations, which began at 8:46 PM UTC.
identified Jun 24, 2025, 09:32 PM UTC

The issue has been identified and a fix is being implemented.
resolved Jun 24, 2025, 09:48 PM UTC

This incident has been resolved.
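
The durations listed on this page can be totalled as a rough cross-check against the 66h 53m headline figure. The sketch below covers only the twelve incidents above that show a duration (three show none), and `parse_duration` is a hypothetical helper written for the duration strings as they appear here.

```python
import re

UNIT_MINUTES = {"d": 24 * 60, "h": 60, "m": 1}

def parse_duration(text: str) -> int:
    """Convert a string such as '1d 23h' or '57m' to whole minutes."""
    return sum(int(n) * UNIT_MINUTES[u] for n, u in re.findall(r"(\d+)([dhm])", text))

# Durations copied from the incident headers above, newest first.
listed = ["2h 4m", "1h 25m", "1h 13m", "1d 23h", "4h 26m", "2h 54m",
          "1h 43m", "57m", "38m", "1h", "2h 28m", "31m"]

total = sum(parse_duration(d) for d in listed)
hours, minutes = divmod(total, 60)
print(f"{hours}h {minutes}m")
```

The listed values sum to 66h 19m, slightly under the headline total. The gap would be consistent with display truncation (for example, "1d 23h" omits minutes), but that reading is an assumption rather than something the page states.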
