Datadog AP1 incident

Metric monitor evaluations and Log Archives degraded

Major Resolved View vendor source →

Datadog AP1 experienced a major incident on August 29, 2024 affecting Log Management and Metrics and Infra Monitoring, lasting 22m. The incident has been resolved; the full update timeline is below.

Started
Aug 29, 2024, 09:42 AM UTC
Resolved
Aug 29, 2024, 10:04 AM UTC
Duration
22m
Detected by Pingoru
Aug 29, 2024, 09:42 AM UTC

Affected components

Log ManagementMetrics and Infra Monitoring

Update timeline

  1. investigating Aug 29, 2024, 09:42 AM UTC

    We are experiencing issues with processing cloud integrations which is resulting in delayed integration metrics for aws.* metrics and issues with Logs Archives. We have disabled notifications relying on these metrics. We are investigating the issue and will provide additional information as it becomes available.

  2. identified Aug 29, 2024, 09:55 AM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Aug 29, 2024, 10:01 AM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Aug 29, 2024, 10:04 AM UTC

    This incident has been resolved.