Splunk Observability Cloud JP0 incident

Splunk APM Monitoring MetricSets delayed

Major · Resolved

Splunk Observability Cloud JP0 experienced a major incident on November 3, 2023, affecting Splunk APM Interface and Splunk APM Monitoring MetricSets, lasting 3h 1m. The incident has been resolved; the full update timeline is below.

Started
Nov 03, 2023, 09:35 PM UTC
Resolved
Nov 04, 2023, 12:37 AM UTC
Duration
3h 1m
Detected by Pingoru
Nov 03, 2023, 09:35 PM UTC

Affected components

Splunk APM Interface, Splunk APM Monitoring MetricSets

Update timeline

  1. investigating Nov 03, 2023, 09:35 PM UTC

    A degradation in the performance of the Splunk APM metrics processing pipeline is causing Monitoring MetricSets to be delayed by more than five minutes. Trace data ingest is not impacted, but service, endpoint and workflow dashboards, and other charts and detectors built from Monitoring MetricSets are impacted.

  2. investigating Nov 03, 2023, 09:57 PM UTC

    We are continuing to investigate the issue.

  3. monitoring Nov 03, 2023, 10:11 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. monitoring Nov 03, 2023, 10:16 PM UTC

    We are continuing to monitor for any further issues.

  5. monitoring Nov 04, 2023, 12:08 AM UTC

    We are continuing to monitor for any further issues.

  6. resolved Nov 04, 2023, 12:37 AM UTC

    This incident has been resolved.