SchemeServe incident

SchemeServe intermittent service

Minor · Resolved

SchemeServe experienced a minor incident on August 1, 2022 affecting 🎩 SchemeServe, 🔗 SchemeServe API, and 🔗 Legacy API, lasting 44m. The incident has been resolved; the full update timeline is below.

Started
Aug 01, 2022, 08:43 AM UTC
Resolved
Aug 01, 2022, 09:27 AM UTC
Duration
44m
Detected by Pingoru
Aug 01, 2022, 08:43 AM UTC

Affected components

🎩 SchemeServe, 🔗 SchemeServe API, 🔗 Legacy API

Update timeline

  1. investigating Aug 01, 2022, 08:43 AM UTC

    We are currently investigating reports of intermittent service across SchemeServe this morning. We will update as soon as we have more information.

  2. resolved Aug 01, 2022, 09:27 AM UTC

    We have identified the root cause of the issue and have applied remedial action. We have confirmation that this has been resolved. We thank you for your patience and appreciate your continued partnership.

  3. postmortem Aug 01, 2022, 09:27 AM UTC

    SchemeServe employs auto-scaling across its application servers to accommodate additional traffic load as required. As the server nodes scaled up for Monday morning, one of the nodes did not build correctly, so the application pods were unable to deploy to it. This had a knock-on effect: the pods were caught in a continual startup loop. The architecture is designed to cope with issues such as this and, given time, it would have recovered on its own; however, it should not have caused gateway timeouts in the way it did. We will be adding further remediation measures to ensure that this does not happen again.
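
The failure mode described in the postmortem (one badly built node leaving its pods in a startup loop) can be illustrated with a minimal sketch. This is not SchemeServe's tooling or its planned remediation: the `PodStatus` record, the `suspect_nodes` helper, the node and pod names, and the restart threshold are all hypothetical. It assumes that a snapshot of per-pod readiness and restart counts is available, and flags any node where every pod is unready and restarting repeatedly, since that pattern points at the node rather than the application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PodStatus:
    """Hypothetical snapshot of one application pod (illustrative schema only)."""
    name: str
    node: str
    restarts: int
    ready: bool


def suspect_nodes(pods, restart_threshold=5):
    """Return the nodes on which every pod is unready and restarting repeatedly.

    If all pods on a node fail this way while pods elsewhere are healthy,
    the node itself is the likely culprit and is a candidate to take out
    of rotation and rebuild.
    """
    by_node = defaultdict(list)
    for pod in pods:
        by_node[pod.node].append(pod)

    suspects = []
    for node, node_pods in by_node.items():
        if all(not p.ready and p.restarts >= restart_threshold for p in node_pods):
            suspects.append(node)
    return sorted(suspects)


# Example data (made up): node-b stands in for the node that built incorrectly.
pods = [
    PodStatus("app-1", "node-a", restarts=0, ready=True),
    PodStatus("app-2", "node-a", restarts=1, ready=True),
    PodStatus("app-3", "node-b", restarts=12, ready=False),
    PodStatus("app-4", "node-b", restarts=9, ready=False),
]
print(suspect_nodes(pods))  # prints ['node-b']
```

A check of this kind would only surface the faulty node; keeping traffic away from unready pods so that gateway timeouts do not occur is a separate concern that this sketch does not cover.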