Splunk Observability Cloud US2 incident

Splunk APM Troubleshooting MetricSets Delayed

Minor Resolved

Splunk Observability Cloud US2 experienced a minor incident on June 26, 2024, affecting Splunk APM Troubleshooting MetricSets, Splunk APM Tag Spotlight, and Splunk APM Business Workflows. It lasted 8h 46m. The incident has been resolved; the full update timeline is below.

Started
Jun 26, 2024, 10:32 AM UTC
Resolved
Jun 26, 2024, 07:18 PM UTC
Duration
8h 46m
Detected by Pingoru
Jun 26, 2024, 10:32 AM UTC

Affected components

Splunk APM Troubleshooting MetricSets
Splunk APM Tag Spotlight
Splunk APM Business Workflows

Update timeline

  1. monitoring Jun 26, 2024, 10:32 AM UTC

    A degradation in the performance of the Splunk APM trace processing pipeline is causing Troubleshooting MetricSets to be delayed by more than five minutes. As a result, the APM Troubleshooting experience, service maps, and Tag Spotlight do not have access to the most recent data. The processing of metrics for Business Workflows, which also depends on this pipeline, is equally delayed. Trace data ingest is not impacted at this time; service-level and endpoint-level Monitoring MetricSets and the detectors built from them are also not impacted. This has been occurring since 12:39 AM PT. The issue has been identified and remediated, and systems should be recovered within the next hour. We will continue to monitor and provide periodic updates.

  2. monitoring Jun 26, 2024, 11:31 AM UTC

    The issue has been identified and remediated. It did not impact trace data ingest or service-level and endpoint-level Monitoring MetricSets. We are continuing to monitor for further issues.

  3. monitoring Jun 26, 2024, 12:30 PM UTC

    The issue has been identified and remediated, and we are waiting for confirmation of the full resolution. It did not impact trace data ingest or service-level and endpoint-level Monitoring MetricSets. We are continuing to monitor for further issues.

  4. monitoring Jun 26, 2024, 01:30 PM UTC

    The issue has been identified and remediated, and we are still awaiting confirmation of the full resolution. It did not impact trace data ingest or service-level and endpoint-level Monitoring MetricSets. We are continuing to monitor for further issues.

  5. monitoring Jun 26, 2024, 02:30 PM UTC

    The issue has been identified and remediated, and we are still awaiting confirmation of the full resolution. It did not impact trace data ingest or service-level and endpoint-level Monitoring MetricSets. We are continuing to monitor for further issues.

  6. monitoring Jun 26, 2024, 03:27 PM UTC

    The issue has been identified and remediated, and we are still awaiting confirmation of the full resolution. It did not impact trace data ingest or service-level and endpoint-level Monitoring MetricSets. We are continuing to monitor for further issues.

  7. monitoring Jun 26, 2024, 04:25 PM UTC

    The issue has been identified and remediated, and we are still awaiting confirmation of the full resolution. It did not impact trace data ingest or service-level and endpoint-level Monitoring MetricSets. We are continuing to monitor for further issues.

  8. monitoring Jun 26, 2024, 05:27 PM UTC

    The issue has been identified and remediated, and we are still awaiting confirmation of the full resolution. It did not impact trace data ingest or service-level and endpoint-level Monitoring MetricSets. We are continuing to monitor for further issues.

  9. resolved Jun 26, 2024, 07:18 PM UTC

    This incident has been resolved.