12Twenty incident
Web application experiencing intermittent memory-related issues
12Twenty experienced a minor incident on January 1, 2021 affecting Web Application, lasting 2d 13h. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Jan 02, 2021, 08:01 AM UTC
An issue is causing memory strain on the web application servers, leading to intermittent failed requests. This issue was likely caused by a code deploy late at night on 12/31. We are currently investigating.
- monitoring Jan 02, 2021, 03:48 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Jan 04, 2021, 05:32 AM UTC
The fix that was implemented late at night on 1/1 resolved the issue. A recent code change caused web server worker threads not to be released properly. Once a particular process reached a certain load, the web server began denying new requests, which resulted in intermittent failed requests for various users. The issue was not immediately apparent in our non-prod environments because of the specific type of request and load required to trigger it. The alerting that did occur was misdiagnosed and subsequently ignored. To prevent this issue from recurring, we will be improving our load-based automated testing and internal release processes.
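For readers unfamiliar with this failure mode, the sketch below illustrates the general pattern described above. It is not 12Twenty's code: `WorkerPool`, `handle_leaky`, `handle_safe`, and `run_load` are hypothetical names, and a semaphore stands in for the web server's worker threads. It assumes the leak occurs on an error path, which is only one of several ways a worker can fail to be released.

```python
import threading


class WorkerPool:
    """Hypothetical bounded pool: each request must hold one worker slot."""

    def __init__(self, size):
        self._slots = threading.BoundedSemaphore(size)

    def try_acquire(self):
        # Non-blocking: returns False when every slot is taken,
        # which models the server denying a new request.
        return self._slots.acquire(blocking=False)

    def release(self):
        self._slots.release()


def handle_leaky(pool, work):
    """Buggy pattern: the slot is never released if work() raises."""
    if not pool.try_acquire():
        return "denied"
    try:
        work()
    except Exception:
        return "error"  # early return skips the release below
    pool.release()
    return "ok"


def handle_safe(pool, work):
    """Corrected pattern: finally releases the slot on every path."""
    if not pool.try_acquire():
        return "denied"
    try:
        work()
        return "ok"
    except Exception:
        return "error"
    finally:
        pool.release()


def failing_request():
    # Stands in for the specific request type needed to trigger the leak.
    raise RuntimeError("simulated failure")


def run_load(handler, pool_size=4, requests=10):
    """Minimal load-style check: send enough requests to exhaust the pool."""
    pool = WorkerPool(pool_size)
    return [handler(pool, failing_request) for _ in range(requests)]
```

With a pool of 4, `run_load(handle_leaky)` starts returning `"denied"` once four slots have leaked, while `run_load(handle_safe)` never does. A load-based test of this shape only catches the leak if it sends the triggering request type more times than there are workers, which is consistent with why low-traffic non-prod environments may not surface it.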