SchemeServe experienced a critical incident on April 8, 2024 affecting 🎩 SchemeServe, lasting 55m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Apr 08, 2024, 08:23 AM UTC
We are investigating a problem with logging into SchemeServe, where some users may experience error messages and be unable to log in.
- resolved Apr 08, 2024, 09:19 AM UTC
This incident has been resolved.
- postmortem Apr 08, 2024, 02:51 PM UTC
This morning our Authentication API failed to serve login requests. The cause was resource exhaustion: scaling up the resource itself triggered a further scale-up event, and this loop continued until the available resources were exhausted. In light of this we have:
1. Changed the scale event triggers so that one scale event does not trigger another.
2. Set up alerts to notify us when the symptoms begin to appear, so that we can take action before issues occur.
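For readers who want a concrete picture of the feedback loop described above, here is a minimal toy simulation. It is an illustrative sketch only, not our actual autoscaling configuration: the names `simulate`, `scale_events_count_as_load`, `max_instances` and `longest_scale_up_run` are hypothetical, and the "alert" is simply a count of back-to-back scale-ups.

```python
def simulate(max_instances, scale_events_count_as_load, steps=20):
    """Return the instance count after each step of a toy autoscaler.

    Hypothetical model: one genuine load spike requests the first scale-up.
    If scale_events_count_as_load is True, every scale-up re-arms the
    trigger, reproducing the runaway loop; if False, it does not.
    """
    instances = 1
    pending_trigger = True  # the single genuine trigger
    history = []
    for _ in range(steps):
        if pending_trigger and instances < max_instances:
            instances += 1
            # Faulty behaviour: the scale-up itself counts as a new trigger.
            pending_trigger = scale_events_count_as_load
        else:
            pending_trigger = False
        history.append(instances)
    return history


def longest_scale_up_run(history):
    """Length of the longest run of consecutive increases in history."""
    longest = current = 0
    for prev, cur in zip(history, history[1:]):
        current = current + 1 if cur > prev else 0
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    looping = simulate(max_instances=5, scale_events_count_as_load=True)
    fixed = simulate(max_instances=5, scale_events_count_as_load=False)
    print("loop:", looping[-1], "instances; run of", longest_scale_up_run(looping))
    print("fixed:", fixed[-1], "instances; run of", longest_scale_up_run(fixed))
```

In this sketch the faulty setting climbs until it reaches `max_instances`, while the corrected setting scales once and stops. An alert threshold on the run of consecutive scale-ups is one assumed way to catch the symptom early.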