KnowledgeOwl incident

Outage detected

Major, Resolved

KnowledgeOwl experienced a major incident on November 6, 2023 that affected the Knowledge Bases, Web Application, and API components and lasted 37 minutes. The incident has been resolved; the full update timeline is below.

Started
Nov 06, 2023, 06:31 PM UTC
Resolved
Nov 06, 2023, 07:08 PM UTC
Duration
37m
Detected by Pingoru
Nov 06, 2023, 06:31 PM UTC

Affected components

Knowledge Bases, Web Application, API

Update timeline

  1. investigating Nov 06, 2023, 06:31 PM UTC

    We just started seeing reports and warnings that KnowledgeOwl is down. Our team is focusing on getting it sorted and back online ASAP.

  2. investigating Nov 06, 2023, 06:36 PM UTC

    We are continuing to investigate this issue.

  3. monitoring Nov 06, 2023, 06:41 PM UTC

    We seem to be back to normal operations, but we're continuing to monitor performance and finish our investigation of the initial root cause.

  4. resolved Nov 06, 2023, 07:08 PM UTC

    Our fix seems to have fully resolved the issue; we'll be issuing a postmortem shortly to explain the root cause in more detail. Thank you all for your patience today!

  5. postmortem Nov 06, 2023, 07:09 PM UTC

    One of our servers went into a failing state and began issuing a vast number of internal connection requests. Those requests overloaded our network. Rebooting the server solved the issue, and internal traffic patterns have returned to normal.

    Next steps

    We are reviewing our alarms on the affected server to see if we could have detected the failure sooner and prevented the downtime.
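
The postmortem describes a server flooding the network with internal connection requests and a review of the alarms that might have caught it earlier. As an illustration only, one way such an alarm could work is a sliding-window rate check per host. The sketch below is not KnowledgeOwl's monitoring; the class name `ConnectionRateAlarm` and the threshold values are hypothetical, chosen only to show the idea.

```python
from collections import deque


class ConnectionRateAlarm:
    """Hypothetical alarm: fires when one host issues more than
    `max_requests` connection requests within `window_seconds`."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()  # request times still inside the window

    def record(self, timestamp):
        """Record one connection request; return True if the alarm fires."""
        self._timestamps.append(timestamp)
        # Drop requests that have fallen out of the sliding window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_requests


# Illustrative thresholds only: more than 3 requests in 10 seconds.
alarm = ConnectionRateAlarm(max_requests=3, window_seconds=10.0)
results = [alarm.record(t) for t in (0.0, 1.0, 2.0, 3.0)]
```

With these assumed thresholds, the fourth request inside the window trips the alarm, and a request arriving well after the burst does not. A real deployment would need thresholds tuned to normal internal traffic, which the incident report does not describe.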