SignalFx EU0 incident

Charts and Detectors Delayed

Major · Resolved

SignalFx EU0 experienced a major incident on October 11, 2024 affecting Splunk Observability Cloud Web Interface and Alerting, lasting 1d 8h. The incident has been resolved; the full update timeline is below.

Started
Oct 11, 2024, 06:22 PM UTC
Resolved
Oct 13, 2024, 02:33 AM UTC
Duration
1d 8h
Detected by Pingoru
Oct 11, 2024, 06:22 PM UTC
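
The duration shown above can be checked from the Started and Resolved timestamps. The following is a minimal sketch, not part of the vendor report: the helper names `parse_utc` and `format_duration` are hypothetical, and the format string assumes an English locale for the month abbreviation.

```python
from datetime import datetime, timedelta, timezone

# Matches stamps like "Oct 11, 2024, 06:22 PM" (assumes English month names).
FMT = "%b %d, %Y, %I:%M %p"


def parse_utc(stamp: str) -> datetime:
    """Parse a status-page timestamp that ends in ' UTC' (hypothetical helper)."""
    naive = datetime.strptime(stamp.removesuffix(" UTC"), FMT)
    return naive.replace(tzinfo=timezone.utc)


def format_duration(delta: timedelta) -> str:
    """Render a timedelta as days, hours, and minutes (hypothetical helper)."""
    total_minutes = int(delta.total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days}d {hours}h {minutes}m"


started = parse_utc("Oct 11, 2024, 06:22 PM UTC")
resolved = parse_utc("Oct 13, 2024, 02:33 AM UTC")
elapsed = resolved - started

print(format_duration(elapsed))  # 1d 8h 11m
```

The result, 1d 8h 11m, is consistent with the reported duration of 1d 8h, which appears to drop the minutes.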

Affected components

Splunk Observability Cloud Web Interface
Alerting

Update timeline

  1. identified Oct 11, 2024, 06:22 PM UTC

    Customers may be experiencing delays in charts and detectors that rely on property and tag updates on metric time series. Datapoint ingest is not affected. We have identified the issue and are actively working on implementing the fix.

  2. identified Oct 11, 2024, 07:30 PM UTC

    We are continuing to make progress on a fix for this issue.

  3. identified Oct 11, 2024, 09:36 PM UTC

    The fix has been implemented and is in the process of being deployed.

  4. identified Oct 11, 2024, 11:59 PM UTC

    We are continuing to deploy the fix and will provide further updates as it starts taking effect.

  5. identified Oct 12, 2024, 02:09 AM UTC

    The fix has been deployed. We are monitoring it and will continue providing updates.

  6. identified Oct 12, 2024, 04:16 AM UTC

    We are in the process of implementing additional fixes and will continue to provide updates.

  7. identified Oct 12, 2024, 06:12 AM UTC

    Additional fixes are now implemented and are in the process of being deployed. We will continue to provide updates.

  8. identified Oct 12, 2024, 09:09 AM UTC

    We are continuing to deploy the additional fixes and will provide further updates as they start taking effect.

  9. identified Oct 12, 2024, 11:07 AM UTC

    Additional fixes are now deployed and starting to take effect. We will continue to monitor and provide updates.

  10. identified Oct 12, 2024, 01:47 PM UTC

    The fixes continue to take effect, and we expect the system to recover at a steady pace over the next few hours. We will continue to monitor and provide updates.

  11. identified Oct 12, 2024, 04:42 PM UTC

    The recovery is ongoing at a steady pace. We will continue to monitor and provide updates.

  12. identified Oct 12, 2024, 09:32 PM UTC

    The recovery has been progressing as expected and will continue over the next few hours. We will continue to provide updates.

  13. monitoring Oct 13, 2024, 02:07 AM UTC

    The system has now recovered, and we are continuing to monitor.

  14. resolved Oct 13, 2024, 02:33 AM UTC

    This incident has been resolved.