12Twenty incident

Web application experiencing intermittent memory-related issues

Minor Resolved

12Twenty experienced a minor incident starting on January 1, 2021, that affected the Web Application and lasted 2d 13h. The incident has been resolved; the full update timeline is below.

Started
Jan 01, 2021, 04:00 PM UTC
Resolved
Jan 04, 2021, 05:32 AM UTC
Duration
2d 13h
Detected by Pingoru
Jan 01, 2021, 04:00 PM UTC

Affected components

Web Application

Update timeline

  1. investigating Jan 02, 2021, 08:01 AM UTC

    An issue is causing memory strain on the web application servers, leading to intermittent failed requests. It was likely caused by a code deploy late on the night of 12/31. We are currently investigating.

  2. monitoring Jan 02, 2021, 03:48 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Jan 04, 2021, 05:32 AM UTC

    The fix implemented late on the night of 1/1 resolved the issue. A recent code change caused web server worker threads not to be released properly. Once a particular process reached a certain load, the web server began denying new requests, which resulted in intermittent failed requests for various users. The issue was not immediately apparent in our non-production environments because of the specific type of request and load required to trigger it. The alerting that did occur was misdiagnosed and subsequently ignored. To prevent a recurrence, we will be improving our load-based automated testing and internal release processes.
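
The failure mode described in the resolved update can be illustrated with a minimal sketch. It is not 12Twenty's code, and nothing is known here about their actual stack; `WorkerPool`, `handle_leaky`, `handle_safe`, and the request functions are hypothetical names chosen for illustration. The sketch assumes a fixed pool of worker slots and one request type whose handler raises before the slot is handed back, which is one common way for workers to go unreleased.

```python
import threading


class WorkerPool:
    """Toy stand-in for a web server's fixed pool of worker slots (hypothetical)."""

    def __init__(self, size: int) -> None:
        self._slots = threading.BoundedSemaphore(size)

    def handle_leaky(self, handler) -> bool:
        """Buggy path: the slot is never released if the handler raises."""
        if not self._slots.acquire(blocking=False):
            return False  # pool exhausted, request denied
        handler()  # an exception here skips the release below
        self._slots.release()
        return True

    def handle_safe(self, handler) -> bool:
        """Fixed path: the slot is released whether or not the handler raises."""
        if not self._slots.acquire(blocking=False):
            return False
        try:
            handler()
        finally:
            self._slots.release()
        return True


def failing_request() -> None:
    """Assumed request type that triggers the leak."""
    raise RuntimeError("simulated failure in one specific request type")


def ok_request() -> None:
    """Ordinary request that completes normally."""


def count_denied(pool: WorkerPool, method_name: str, requests) -> int:
    """Send requests through the pool and count how many are denied."""
    handle = getattr(pool, method_name)
    denied = 0
    for request in requests:
        try:
            if not handle(request):
                denied += 1
        except RuntimeError:
            pass  # the handler failed, but the request was accepted, not denied
    return denied
```

With a pool of four slots, three failing requests leave one slot free and later requests still succeed, so a light test run looks healthy. A fourth failing request exhausts the pool, and every subsequent request is denied. A load-based test along these lines, `count_denied(WorkerPool(4), "handle_leaky", [failing_request] * 4 + [ok_request] * 6)`, would flag the leaky path while the `handle_safe` variant denies nothing.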