12Twenty incident

Background Job Processing Delays


12Twenty posted a notice-level incident on October 15, 2020. According to the postmortem, the underlying delays occurred on October 13, 2020 and lasted about 4 hours. The incident has been resolved; the full update timeline is below.

Started
Oct 15, 2020, 09:03 PM UTC
Resolved
Oct 13, 2020, 04:00 PM UTC
Duration
About 4 hours (9:00am to 1:00pm PT on Oct 13, 2020, per the postmortem)
Detected by Pingoru
Oct 15, 2020, 09:03 PM UTC

Update timeline

  1. resolved Oct 15, 2020, 09:03 PM UTC

    Background jobs processed through the 12Twenty platform are experiencing delays. Background jobs include the sending of emails (including email verification codes), document packet generation, data uploads, etc. For example, when generating document packets, the user will see a 'please wait' message until the background job infrastructure finishes processing the jobs queued ahead of the user's job.

  2. postmortem Oct 15, 2020, 09:03 PM UTC

    ## **Summary of Impact**

    On 2020-10-13, between 9:00am PT and 1:00pm PT, background jobs processed through 12Twenty's platform were delayed. Background jobs include the sending of emails (including email verification codes), document packet generation, data uploads, etc. For example, when generating document packets, the user saw a 'please wait' message until our background job infrastructure finished processing the jobs queued ahead of the user's job.

    ## **Root Cause**

    Similar to the incident experienced on 9/30: a number of personalized mass emails with large attachments were sent by numerous schools through our platform within a short timeframe. A few intermediary patches were deployed in the last two weeks, but they were not sufficient to fully address the problem.

    ## **Mitigation**

    The background job infrastructure self-recovered after about 4 hours of delays.

    ## **Next Steps**

    We know how frustrating these issues can be, and we apologize for the delay in procuring an adequate resolution. A patch to specifically address the issue described above was deployed on 10/13: mass emails will no longer affect the processing of other background jobs such as document packet generation and verification code emails.

    The list below highlights some of the additional steps we are taking to ensure this type of incident does not occur in the future:

    * Further improve the prioritization of background job types: expecting to deploy within two weeks.
    * Faster scale-out of background job infrastructure: expecting to have a preliminary solution within the next week; a longer-term solution is also in progress.
    * Work with our email sending provider to look into the excessive performance degradation when emails with large attachments are sent: still in progress.
    * Improved monitoring: expecting to deploy within the next week.
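
To illustrate the prioritization idea mentioned in the postmortem, here is a minimal sketch of a priority-ordered job queue in which bulk mass emails cannot delay verification emails or document packets queued after them. This is not 12Twenty's implementation, which the postmortem does not describe. The job type names, the `PRIORITY` table, and the `JobQueue` class are all hypothetical and chosen only for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical priorities (assumption): a lower number is dequeued first.
PRIORITY = {
    "verification_email": 0,
    "document_packet": 1,
    "data_upload": 2,
    "mass_email": 3,
}


@dataclass(order=True)
class Job:
    priority: int
    seq: int  # tie-breaker that keeps FIFO order within one priority level
    kind: str = field(compare=False)
    payload: str = field(compare=False)


class JobQueue:
    """In-memory priority queue; a real system would use a durable broker."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def enqueue(self, kind, payload):
        job = Job(PRIORITY[kind], next(self._counter), kind, payload)
        heapq.heappush(self._heap, job)

    def dequeue(self):
        # Return the highest-priority job, or None when the queue is empty.
        return heapq.heappop(self._heap) if self._heap else None


def demo():
    queue = JobQueue()
    # A burst of mass emails arrives first...
    for i in range(3):
        queue.enqueue("mass_email", f"newsletter-{i}")
    # ...followed by time-sensitive user-facing jobs.
    queue.enqueue("document_packet", "packet-A")
    queue.enqueue("verification_email", "code-123")

    order = []
    while (job := queue.dequeue()) is not None:
        order.append((job.kind, job.payload))
    return order
```

Calling `demo()` drains the verification email and the document packet before any of the mass emails, even though the mass emails were enqueued first. Strict priority ordering like this can starve low-priority jobs under sustained load, so a production design would likely combine it with separate worker pools or aging.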