Pandium incident

Runs failing

Critical Resolved

Pandium experienced a critical incident on August 1, 2022 affecting Runs, lasting 4h 43m. The incident has been resolved; the full update timeline is below.

Started
Aug 01, 2022, 12:01 PM UTC
Resolved
Aug 01, 2022, 04:44 PM UTC
Duration
4h 43m
Detected by Pingoru
Aug 01, 2022, 12:01 PM UTC

Affected components

Runs

Update timeline

  1. investigating Aug 01, 2022, 12:01 PM UTC

    We are currently investigating an issue that is causing some runs to fail.

  2. investigating Aug 01, 2022, 01:24 PM UTC

    We are continuing to investigate this issue.

  3. investigating Aug 01, 2022, 01:55 PM UTC

    We have restored a key service and are continuing to investigate the underlying issue.

  4. identified Aug 01, 2022, 03:09 PM UTC

    This issue has been identified and we are working on a fix.

  5. monitoring Aug 01, 2022, 03:24 PM UTC

    A fix has been deployed and has mitigated the issue. It appears that stale configs were applied. We are continuing to monitor as runs come back online.

  6. resolved Aug 01, 2022, 04:44 PM UTC

    Runs are completing successfully. We will continue to investigate root causes, but the immediate issue has been resolved.