Applicaster incident
Processing actions fail (layout changes)
Applicaster experienced a notice incident on November 15, 2023 affecting Studio, lasting 1h 52m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Nov 15, 2023, 02:23 PM UTC
We are currently investigating this issue.
- identified Nov 15, 2023, 02:55 PM UTC
The issue has been identified and a fix is being implemented.
- monitoring Nov 15, 2023, 03:07 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Nov 15, 2023, 04:16 PM UTC
This incident has been resolved.
- postmortem Nov 17, 2023, 03:06 PM UTC
A newly created private cluster leaked into production job processing. Sidekiq (background job processing) is connected to the same Redis DB on all clusters, and the new cluster didn't have the Zapp app running properly (pods), so jobs weren't being processed. Once we found the issue, we downscaled the new private cluster to 0, increased the node size in the AWS console, and restarted the pods on the prod-us1 cluster (the running production cluster). To prevent this in the future, we need to make sure there are no running nodes on redundant clusters, only on the active one. In addition, we should consider separating the Redis DB between clusters, although this could cause some running processes to be lost.
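The two preventive measures above could be checked automatically before a cluster is brought up. The following is only a minimal sketch under stated assumptions: the `Cluster` record, the `private-new` cluster name, the node counts, and the Redis URL are all hypothetical placeholders, and in practice the inventory would have to come from whatever tooling describes the clusters.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    # Hypothetical inventory record; field names are assumptions for illustration.
    name: str
    running_nodes: int
    redis_url: str  # Redis DB that this cluster's Sidekiq workers connect to


def redundant_clusters_with_nodes(clusters, active):
    """Return names of non-active clusters that still have running nodes."""
    return [c.name for c in clusters if c.name != active and c.running_nodes > 0]


def shared_redis(clusters):
    """Return {redis_url: [cluster names]} for each Redis DB used by more than one cluster."""
    by_url = defaultdict(list)
    for c in clusters:
        by_url[c.redis_url].append(c.name)
    return {url: names for url, names in by_url.items() if len(names) > 1}


# Hypothetical inventory resembling the incident: both clusters use one Redis DB.
clusters = [
    Cluster("prod-us1", running_nodes=6, redis_url="redis://redis.example.internal:6379/0"),
    Cluster("private-new", running_nodes=2, redis_url="redis://redis.example.internal:6379/0"),
]

print(redundant_clusters_with_nodes(clusters, active="prod-us1"))
print(shared_redis(clusters))
```

With this sample inventory, both checks flag the private cluster: it still has running nodes, and it shares a Redis DB with prod-us1. An empty result from both functions would correspond to the state the postmortem aims for.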