Pandium experienced a major incident on July 14, 2024, affecting API and Runs and lasting 14h 36m. The incident has been resolved; the full update timeline is below.
Affected components
API, Runs
Update timeline
- investigating Jul 14, 2024, 07:06 AM UTC
The underlying control plane is under high load. We are investigating. Runs and the API are running slowly.
- identified Jul 14, 2024, 03:53 PM UTC
The issue is with the control plane of our underlying hosting provider. We are working with them on mitigation.
- identified Jul 14, 2024, 04:35 PM UTC
The underlying issue has been identified: the etcd cluster of the control plane is misbehaving. We are working on mitigation.
- monitoring Jul 14, 2024, 05:37 PM UTC
A fix has been implemented. We are now monitoring recovery.
- resolved Jul 14, 2024, 09:42 PM UTC
Scheduled runs have caught up, and we have not experienced any further issues with the control plane.