InEvent incident

Database slowness

Major Resolved

InEvent experienced a major incident on August 26, 2020 affecting API, lasting 6h 27m. The incident has been resolved; the full update timeline is below.

Started
Aug 26, 2020, 12:05 PM UTC
Resolved
Aug 26, 2020, 06:33 PM UTC
Duration
6h 27m
Detected by Pingoru
Aug 26, 2020, 12:05 PM UTC

Affected components

API

Update timeline

  1. investigating Aug 26, 2020, 12:40 PM UTC

    We are currently investigating this issue.

  2. investigating Aug 26, 2020, 12:43 PM UTC

    We are continuing to investigate this issue.

  3. monitoring Aug 26, 2020, 02:59 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Aug 26, 2020, 06:33 PM UTC

    This incident has been resolved.

  5. postmortem Aug 26, 2020, 07:15 PM UTC

    Our servers use an asynchronous, event-driven approach, rather than threads, to handle requests. With this modular event-driven architecture, we can have a bucket overflow that outputs the requests as soon as they arrive. Once these requests arrive, we perform some safety checks and then redirect them to the appropriate logical application. This can be done in multiple ways. What we realized during today's operations was that the servers were not able to pipe this many requests to the application layer quickly enough. The socket was not able to quickly send the requests to the database, process the output, and return it to the user. To fix this issue, we are improving our load balancer, changing our socket configurations, and changing the database processing mechanism so that queries can be processed without any additional allocation overload.
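
    The failure mode described in the postmortem can be illustrated with a small simulation. This is only a sketch under stated assumptions, not InEvent's actual implementation: the function name `simulate_burst`, the bounded queue, and the sleep that stands in for a database query are all hypothetical. It shows how an event-driven front end that accepts requests faster than the database layer can drain them ends up turning requests away once its buffer is full.

```python
import asyncio


async def simulate_burst(n_requests, queue_size, db_latency, workers):
    """Simulate a burst of requests hitting a bounded buffer in front of a slow backend.

    All names and parameters are hypothetical; they are chosen for illustration only.
    Returns a (served, rejected) tuple.
    """
    queue = asyncio.Queue(maxsize=queue_size)
    served = 0
    rejected = 0

    async def db_worker():
        # Each worker takes one request at a time and "queries the database".
        nonlocal served
        while True:
            await queue.get()
            await asyncio.sleep(db_latency)  # stand-in for a real database round trip
            served += 1
            queue.task_done()

    tasks = [asyncio.create_task(db_worker()) for _ in range(workers)]

    # The whole burst arrives before the workers get a chance to run,
    # so anything beyond the buffer capacity is rejected immediately.
    for request_id in range(n_requests):
        try:
            queue.put_nowait(request_id)
        except asyncio.QueueFull:
            rejected += 1

    # Wait for the buffered requests to be processed, then stop the workers.
    await queue.join()
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return served, rejected


if __name__ == "__main__":
    print(asyncio.run(simulate_burst(100, 10, 0.001, 2)))
```

    In this sketch, a burst of 100 requests against a buffer of 10 leaves 90 requests unserved, while a buffer of 100 serves all of them at the cost of longer queueing. The remediation steps listed in the postmortem (load balancer, socket configuration, database processing) are the real-world counterparts of tuning the arrival rate, the buffer, and the per-request latency here.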