Scalyr incident

app.scalyr.com inaccessible

Critical Resolved

Scalyr experienced a critical incident on March 14, 2023, affecting the Main Site and lasting about 5 hours. The incident has been resolved; the full update timeline is below.

Started
Mar 14, 2023, 10:58 AM UTC
Resolved
Mar 14, 2023, 03:59 PM UTC
Duration
5h
Detected by Pingoru
Mar 14, 2023, 10:58 AM UTC

Affected components

Main Site

Update timeline

  1. investigating Mar 14, 2023, 10:58 AM UTC

    The US cluster is inaccessible. We are currently investigating.

  2. monitoring Mar 14, 2023, 12:32 PM UTC

    A fix has been implemented and we are monitoring the results. The site is now accessible but may initially be slow to respond.

  3. identified Mar 14, 2023, 12:42 PM UTC

    The system is still unstable. Access is intermittent, and we are continuing to work on addressing the issue.

  4. identified Mar 14, 2023, 02:41 PM UTC

    The underlying issue has been addressed. We are continuing to scale out in order to fully recover.

  5. monitoring Mar 14, 2023, 03:43 PM UTC

    UI and ingestion have generally recovered and we are continuing to monitor. The system is currently catching up on the ingestion backlog.

  6. resolved Mar 14, 2023, 03:59 PM UTC

    This incident has been resolved.