Technolutions experienced a notice-level incident on February 6, 2024, affecting Slate and lasting 1d 11h. The incident has been resolved; the full update timeline is below.
Update timeline
- monitoring Feb 06, 2024, 01:25 PM UTC
An AWS-initiated overnight update to a series of Redis servers used for background service queue management caused the queues to enter a stalled state, which delayed background queue processing. The delayed queues are now processing and will complete shortly.
- monitoring Feb 06, 2024, 10:14 PM UTC
As of earlier this afternoon, fewer than 1% of databases had overnight jobs still executing. The jobs on these remaining databases typically take hours to complete overnight and, given the accumulation of daytime jobs and blocking processes on those databases, are continuing to run. Separately, we have adjusted the background process monitoring to recover better from a scenario in which the Redis servers become temporarily unavailable due to hardware or software updates to those nodes.
- resolved Feb 08, 2024, 12:57 AM UTC
No issues have been observed since the overnight background service queue delays on 2/6, and all overnight queues on 2/7 performed as expected. We will continue to monitor background services to ensure ongoing operation within desired ranges.