KnowledgeOwl experienced a major incident on November 6, 2023, affecting Knowledge Bases, Web Application, and 1 more component. It lasted 37 minutes. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Nov 06, 2023, 06:31 PM UTC
We just started seeing reports and warnings that KnowledgeOwl is down. Our team is focusing on getting it sorted and back online ASAP.
- investigating Nov 06, 2023, 06:36 PM UTC
We are continuing to investigate this issue.
- monitoring Nov 06, 2023, 06:41 PM UTC
We seem to be back to normal operations, but we're continuing to monitor performance and finish our investigation of the initial root cause.
- resolved Nov 06, 2023, 07:08 PM UTC
Our fix seems to have fully resolved the issue; we'll be issuing a postmortem shortly to explain root cause in more detail. Thank you all for your patience today!
- postmortem Nov 06, 2023, 07:09 PM UTC
One of our servers went into a failing state and began issuing a vast number of internal connection requests. Those requests overloaded our network. Rebooting the server solved the issue, and internal traffic patterns have returned to normal.
## Next steps
We are reviewing our alarms on the affected server to see if we could have detected the failure sooner and prevented the downtime.