Strigo incident

High memory consumption making the app inaccessible

Critical Resolved

Strigo experienced a critical incident on February 21, 2023 affecting Website and Strigo service, lasting 5h 1m. The incident has been resolved; the full update timeline is below.

Started
Feb 21, 2023, 08:35 AM UTC
Resolved
Feb 21, 2023, 01:37 PM UTC
Duration
5h 1m
Detected by Pingoru
Feb 21, 2023, 08:35 AM UTC

Affected components

Website, Strigo service

Update timeline

  1. investigating Feb 21, 2023, 08:52 AM UTC

    We've encountered an issue making the app inaccessible. Our engineers are working on resolving the issue.

  2. investigating Feb 21, 2023, 08:52 AM UTC

    We are continuing to investigate this issue.

  3. investigating Feb 21, 2023, 08:54 AM UTC

    Resources were reallocated and the app is up and running. We're monitoring and investigating the source that caused the issue.

  4. investigating Feb 21, 2023, 01:34 PM UTC

    We are continuing to monitor the provided resolution.

  5. resolved Feb 21, 2023, 01:37 PM UTC

    We encountered a CPU and memory usage spike in our services. This is suspected to be a result of a network connection issue that is now resolved. After 4 hours of monitoring we feel safe to mark the issue as resolved.

  6. postmortem Feb 22, 2023, 08:47 AM UTC

    Yesterday (the 21st of February, at 12:35 AM), we experienced downtime in our Classroom and Lounge applications that lasted 17m. We noticed that our system was stable overall, but our main classroom app's containers were showing high CPU usage across the board, which prevented our customers from accessing the app. After a rather deep investigation, we concluded (without hard evidence) that this was probably related to network issues between our clients and our datacenters, which caused the clients (due to the reactive nature of our app) to bombard our main app with requests. We didn't see any issues with our database, load balancers, or scheduling systems. We've taken steps to ensure that if our main app isn't available, it will be restarted automatically by our scheduler to clear up those requests. We apologize for the inconvenience.
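The postmortem attributes the load spike to clients re-sending requests after a network disruption. One common way to soften that kind of retry storm is capped exponential backoff with random jitter on the client side. The sketch below only illustrates that general technique: the postmortem does not say Strigo's clients use it, and the function name `backoff_delays` and its parameters (`base`, `cap`, `rng`) are hypothetical.

```python
import random


def backoff_delays(attempts, base=0.5, cap=30.0, rng=None):
    """Return one retry delay (in seconds) per attempt.

    Uses capped exponential backoff with "full jitter": each delay is drawn
    uniformly from [0, min(cap, base * 2**attempt)], so clients that lose
    their connection at the same moment do not all retry at the same moment.
    All names and default values here are illustrative assumptions.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # The upper bound doubles with each attempt until it reaches the cap.
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))
    return delays


if __name__ == "__main__":
    # Print the randomized wait before each of six hypothetical reconnects.
    for i, delay in enumerate(backoff_delays(6)):
        print(f"attempt {i}: wait {delay:.2f}s before retrying")
```

Spreading retries out this way complements the scheduler-driven restart described in the postmortem: the restart clears requests that have already piled up, while backoff reduces how quickly they pile up in the first place.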