SchemeServe incident

Errors when updating existing cases

Minor Resolved

SchemeServe experienced a minor incident on March 21, 2024 affecting 🎩 SchemeServe, lasting 24m. The incident has been resolved; the full update timeline is below.

Started
Mar 21, 2024, 05:46 PM UTC
Resolved
Mar 21, 2024, 06:11 PM UTC
Duration
24m
Detected by Pingoru
Mar 21, 2024, 05:46 PM UTC

Affected components

🎩 SchemeServe

Update timeline

  1. investigating Mar 21, 2024, 05:46 PM UTC

    We are aware of an issue that may cause some users to see errors when saving changes to existing cases. We are investigating as a matter of urgency.

  2. resolved Mar 21, 2024, 06:11 PM UTC

    We have identified and fixed the issue. A post-mortem will follow.

  3. postmortem Mar 22, 2024, 11:11 AM UTC

    For some time, SchemeServe has been approaching the limit of 32-bit integers for IDs in its main answers storage. Some time ago, we implemented a zero-downtime change to the main storage that enabled it to store 64-bit integers, well before the extra space was actually required. Yesterday evening, however, the IDs crossed the 32-bit threshold and exposed an instance of archive storage that had not been converted from 32-bit to 64-bit integers. As a result, operations that used the archive storage failed. Due to the diverse and varying workloads that SchemeServe processes, our tests did not catch this beforehand. We quickly identified the issue and applied the storage change to the archive.
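
The failure mode described in the post-mortem can be illustrated with a minimal sketch. It assumes signed integer IDs (the post-mortem does not say whether they are signed or unsigned), and the store names `answers_main` and `answers_archive` are hypothetical, not SchemeServe's actual schema. The idea is to compare the next ID against the width of every store that receives it, so that an unconverted store shows up before the threshold is crossed.

```python
# Hypothetical illustration only: names and layout are assumptions,
# not taken from SchemeServe's systems.

INT32_MAX = 2**31 - 1  # 2,147,483,647, the largest signed 32-bit value


def fits(value: int, bits: int) -> bool:
    """Return True if value fits in a signed integer of the given width."""
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1


def stores_at_risk(next_id: int, id_widths: dict[str, int]) -> list[str]:
    """List the stores whose ID column is too narrow to hold next_id."""
    return [name for name, bits in id_widths.items() if not fits(next_id, bits)]


# Assumed layout: main storage already widened, archive overlooked.
id_widths = {"answers_main": 64, "answers_archive": 32}

print(stores_at_risk(INT32_MAX, id_widths))      # []
print(stores_at_risk(INT32_MAX + 1, id_widths))  # ['answers_archive']
```

Running a check like this against every store that holds a copy of the ID, rather than only the primary one, is one way such a gap might be caught ahead of time.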