Strigo incident

Classroom Labs Outage

Major · Resolved

Strigo experienced a major incident on August 11, 2020 that affected the Strigo service and AWS EC2 (eu-west-1) and lasted 53 minutes. The incident has been resolved; the full update timeline is below.

Started
Aug 11, 2020, 07:00 PM UTC
Resolved
Aug 11, 2020, 07:53 PM UTC
Duration
53m
Detected by Pingoru
Aug 11, 2020, 07:00 PM UTC

Affected components

Strigo service, AWS EC2 (eu-west-1)

Update timeline

  1. investigating Aug 11, 2020, 07:00 PM UTC

    We are currently investigating this issue.

  2. investigating Aug 11, 2020, 07:00 PM UTC

    We are currently investigating.

  3. monitoring Aug 11, 2020, 07:23 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. monitoring Aug 11, 2020, 07:29 PM UTC

    We experienced an outage in our Lab Relay Service. Something appears to have triggered a massive reconnection cycle from our relay to all of the labs, which drove CPU usage high across the board and in turn prevented reconnects from succeeding. We've added more resources, and the service appears to have settled down. We don't yet know the root cause, but we'll keep investigating.

  5. resolved Aug 11, 2020, 07:53 PM UTC

    This incident has been resolved.
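
Update 4 describes a reconnection storm: many simultaneous reconnect attempts saturate CPU, which makes further attempts fail. Strigo's stated remedy was to add resources, and the report does not say how the relay schedules its retries. As an illustration only, the sketch below shows one common general-purpose mitigation for this failure mode, exponential backoff with full jitter. The names `backoff_delays` and `reconnect` and all parameter values are hypothetical and are not taken from Strigo's code.

```python
import random
import time


def backoff_delays(base=0.5, cap=30.0, attempts=6, rng=None):
    """Return one full-jitter delay (in seconds) per retry.

    The ceiling doubles on each retry up to `cap`; the actual delay is
    drawn uniformly from [0, ceiling] so that clients spread out
    instead of retrying in lockstep.
    """
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0.0, ceiling))
    return delays


def reconnect(connect, attempts=6, base=0.5, cap=30.0,
              sleep=time.sleep, rng=None):
    """Call `connect` until it succeeds or `attempts` calls have failed.

    `connect` is a hypothetical callable that raises ConnectionError on
    failure. The last failure is re-raised to the caller.
    """
    delays = backoff_delays(base, cap, attempts - 1, rng)
    for i in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if i == attempts - 1:
                raise
            sleep(delays[i])  # wait a randomized interval before retrying
```

Randomizing each delay across the whole interval, rather than adding a small jitter to a fixed schedule, is what keeps a large fleet of labs from reconnecting at the same instant after a shared failure.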