Splunk Observability Cloud AU0 incident

Splunk APM Troubleshooting and Monitoring MetricSets Delayed

Minor Resolved

Splunk Observability Cloud AU0 experienced a minor incident on February 25, 2024 affecting Splunk APM Monitoring MetricSets, Splunk APM Troubleshooting MetricSets, and 3 more components, lasting 1h 25m. The incident has been resolved; the full update timeline is below.

Started
Feb 25, 2024, 01:00 AM UTC
Resolved
Feb 25, 2024, 02:26 AM UTC
Duration
1h 25m
Detected by Pingoru
Feb 25, 2024, 01:00 AM UTC

Affected components

Splunk APM Monitoring MetricSets
Splunk APM Troubleshooting MetricSets
Splunk APM Trace Data
Splunk APM Tag Spotlight
Splunk APM Business Workflows

Update timeline

  1. investigating Feb 25, 2024, 01:55 AM UTC

    A degradation in the performance of the Splunk APM trace processing pipeline is causing Troubleshooting and Monitoring MetricSets to be delayed by more than five minutes. As a result, the APM Troubleshooting experience, service maps, and Tag Spotlight do not have access to the most recent data. The processing of metrics for Business Workflows, which also depends on this pipeline, is equally delayed. Service-level endpoint, workflow dashboards, and other charts and detectors built from Monitoring MetricSets are impacted.

  2. identified Feb 25, 2024, 02:03 AM UTC

    The issue has been identified and a fix is being implemented.

  3. identified Feb 25, 2024, 02:04 AM UTC

    We are continuing to work on a fix for this issue.

  4. resolved Feb 25, 2024, 02:26 AM UTC

    This incident has been resolved.