Applicaster incident

Processing actions fail (layout changes)

Applicaster experienced a notice-level incident on November 15, 2023, affecting Studio and lasting 1h 52m. The incident has been resolved; the full update timeline is below.

Started
Nov 15, 2023, 02:23 PM UTC
Resolved
Nov 15, 2023, 04:16 PM UTC
Duration
1h 52m
Detected by Pingoru
Nov 15, 2023, 02:23 PM UTC

Affected components

Studio

Update timeline

  1. investigating Nov 15, 2023, 02:23 PM UTC

    We are currently investigating this issue.

  2. identified Nov 15, 2023, 02:55 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Nov 15, 2023, 03:07 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Nov 15, 2023, 04:16 PM UTC

    This incident has been resolved.

  5. postmortem Nov 17, 2023, 03:06 PM UTC

    There was a leak between the newly created private cluster and the production cluster: Sidekiq (background job processing) is connected to the same Redis DB on all clusters. The new cluster didn’t have the Zapp app running properly (pods), and therefore jobs weren’t being processed. Once we found the issue, we downscaled the new private cluster to 0, increased the node size in the AWS console, and restarted the pods on the prod-us1 cluster (the running production cluster). To prevent this in the future, we need to make sure there are no running nodes on redundant clusters, only on the active one. In addition, we should consider separating the Redis DB between clusters, although this could cause some of the running processes to be lost.
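
The two preventive measures in the postmortem can be sketched roughly as follows. This is a minimal illustration in Python, not Applicaster's actual tooling; Sidekiq itself would be configured in Ruby. The cluster name `private`, the DB index mapping, the Redis host, and both function names are assumptions made up for the example; only `prod-us1` appears in the report.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping of cluster name to Redis DB index.
# Only "prod-us1" is named in the incident report; the rest is assumed.
CLUSTER_REDIS_DB = {"prod-us1": 0, "private": 1}


def redis_url_for(cluster: str, base_url: str) -> str:
    """Return base_url with its path set to the cluster's own Redis DB index."""
    if cluster not in CLUSTER_REDIS_DB:
        raise ValueError(f"unknown cluster: {cluster}")
    parts = urlsplit(base_url)
    return urlunsplit(parts._replace(path=f"/{CLUSTER_REDIS_DB[cluster]}"))


def redundant_clusters_with_nodes(node_counts: dict, active: str) -> list:
    """List non-active clusters that still report running nodes."""
    return sorted(
        name
        for name, count in node_counts.items()
        if name != active and count > 0
    )


if __name__ == "__main__":
    # Each cluster would point its job workers at a different DB index.
    print(redis_url_for("private", "redis://redis.example.internal:6379/0"))
    # A pre-deploy check could fail if this list is not empty.
    print(redundant_clusters_with_nodes({"prod-us1": 6, "private": 3}, "prod-us1"))
```

As the postmortem notes, switching a cluster to a separate DB has a cost: jobs already queued in the shared DB would not be visible to workers reading from the new one, so a migration would likely need to drain the queue first.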