SCORM Cloud incident

Elevated Request Latency

Major Resolved

SCORM Cloud experienced a major incident on June 9, 2020 affecting SCORM Cloud Website and SCORM Cloud API, lasting 3h 16m. The incident has been resolved; the full update timeline is below.

Started
Jun 09, 2020, 02:57 PM UTC
Resolved
Jun 09, 2020, 06:14 PM UTC
Duration
3h 16m
Detected by Pingoru
Jun 09, 2020, 02:57 PM UTC

Affected components

SCORM Cloud Website, SCORM Cloud API

Update timeline

  1. investigating Jun 09, 2020, 02:57 PM UTC

    Engineers are investigating unexpected high request latency.

  2. identified Jun 09, 2020, 03:19 PM UTC

    The issue has been identified and a fix is being implemented. In the meantime, we have a workaround and we're monitoring the system.

  3. resolved Jun 09, 2020, 06:14 PM UTC

    A fix for this incident has been deployed. We will attach a postmortem with more information to our status page later.

  4. postmortem Jun 10, 2020, 06:11 PM UTC

    First, we apologize for the unexpected service issues on June 9, 2020. Between 2:36 PM UTC and 2:45 PM UTC, SCORM Cloud experienced elevated response latency, and customers may have encountered request timeouts during this time. Response times were especially high at approximately 2:38 PM UTC.

    This service degradation was caused by an inefficient database query in an API call. The query was invoked by the API only for customers on our Trial (free) plan, and its execution time was proportional to an account's historical registration count. On June 9, a large customer account was downgraded to Trial status for billing reasons. After the downgrade, the account's API traffic continued at the volume normal for an account of its size, and the inefficient query overwhelmed backend database resources.

    After identifying the issue, we implemented temporary mitigation measures while rewriting the inefficient query. The new implementation executes in a constant amount of time, regardless of account size. At 6:12 PM UTC we deployed the new query and resolved the issue.

    This specific problem will not recur. Our code review process should prevent unbounded queries from being added to API methods. Additionally, we will review older parts of our code base for similar problems.

    Again, we apologize for the inconvenience.