Spacelift incident

Delayed handling of events

Minor, Resolved

Spacelift experienced a minor incident on October 30, 2024 affecting Event processing and Public workers, lasting 1h 12m. The incident has been resolved; the full update timeline is below.

Started
Oct 30, 2024, 03:40 PM UTC
Resolved
Oct 30, 2024, 04:53 PM UTC
Duration
1h 12m
Detected by Pingoru
Oct 30, 2024, 03:40 PM UTC

Affected components

Event processing, Public workers

Update timeline

  1. investigating Oct 30, 2024, 03:40 PM UTC

    We are seeing an increased error rate when handling webhooks for all customers. This is causing a delay in runs being scheduled.

  2. monitoring Oct 30, 2024, 04:13 PM UTC

    We have introduced a quick fix for handling events.

  3. identified Oct 30, 2024, 04:28 PM UTC

    We have identified an issue with our object storage abstraction and are rolling back to a previous release.

  4. resolved Oct 30, 2024, 04:53 PM UTC

    Our monitors are reporting that all events have been processed. The issue has been resolved.