BenchPrep incident

Degraded Performance

Minor Resolved

BenchPrep experienced a minor incident on August 23, 2022 affecting the API, lasting 7h 30m. The incident has been resolved; the full update timeline is below.

Started
Aug 23, 2022, 06:13 PM UTC
Resolved
Aug 24, 2022, 01:44 AM UTC
Duration
7h 30m
Detected by Pingoru
Aug 23, 2022, 06:13 PM UTC

Affected components

API

Update timeline

  1. investigating Aug 23, 2022, 06:13 PM UTC

    We are experiencing an issue with high load times within our non-learner BenchPrep applications. Learner applications are not affected. We are currently provisioning more resources and will continue to investigate.

  2. investigating Aug 23, 2022, 06:17 PM UTC

    We are conducting a short restart of services which can cause non-learner applications to be unavailable for a few minutes.

  3. investigating Aug 23, 2022, 06:35 PM UTC

    The restart did not alleviate the issues. We are restoring the services and rolling back the changes.

  4. investigating Aug 23, 2022, 06:41 PM UTC

    The rollback was completed and services have been restored. We are continuing to investigate this issue.

  5. monitoring Aug 23, 2022, 10:26 PM UTC

    We have confirmed that performance has stabilized and are continuing to monitor the affected applications. We are planning to introduce significant improvements to our progress cluster during the next database maintenance.

  6. resolved Aug 24, 2022, 01:44 AM UTC

    We found a regression in an API reporting request that was causing significant memory consumption, impacting the nodes and the entire system. The fix has been deployed and verified.