Healthwise experienced a major incident on December 14, 2022, affecting Coach and Content Browser and 1 more component, lasting 2h 5m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Dec 14, 2022, 07:58 PM UTC
Healthwise-hosted solutions are experiencing performance issues. Our Network Administrators and Engineers are working to fix the problem. We will post updates as we learn more.
- monitoring Dec 14, 2022, 09:02 PM UTC
We have identified the issue and implemented steps to mitigate the problem. Things are returning to normal and we will continue to monitor the situation. We will post an update when things are resolved.
- resolved Dec 14, 2022, 10:03 PM UTC
All performance issues have been resolved. We will post a root cause analysis once we have completed our full investigation. If the investigation has not been completed within 1 week we will post an interim RCA with the information that we currently have available.
- postmortem Dec 20, 2022, 11:35 PM UTC
## Introduction

The purpose of this Root Cause Analysis (RCA) is to determine the causes that contributed to the performance issues of the Healthwise-hosted solutions on December 14, 2022.

## Event Description

At 12:23 PM MST on Wednesday, December 14, 2022, Healthwise administrators were alerted to intermittent performance degradation of Healthwise-hosted applications. Healthwise found that the search index was overwhelmed by heavy network traffic. At 1:38 PM MST, Healthwise restored service by reducing the number of requests and rebuilding the index. The total time of the incident was 1 hour and 15 minutes; however, degradation was intermittent during that time.

## Findings and Root Cause

Based on the investigation, the team determined the following: the rate limiting solution was allowing too much network traffic through to backend services that couldn’t handle the load. Infrastructure engineers mitigated the service degradation by isolating the disruptive traffic, which stabilized the environment.

## Corrective Action

Healthwise has adjusted its rate limiting solution to lower the number of permitted requests and is committed to improving how periods of heavy network traffic are handled.
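
To illustrate what lowering the number of permitted requests can mean in practice, the following is a minimal sketch of a generic token-bucket rate limiter. The RCA does not describe how Healthwise's rate limiting solution actually works, so this is not its implementation; the class name `TokenBucket` and the parameters `rate_per_sec` and `capacity` are hypothetical and chosen only for illustration.

```python
import time


class TokenBucket:
    """Generic token-bucket limiter (illustrative only; not the actual
    rate limiting solution referred to in this RCA)."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate_per_sec = rate_per_sec  # hypothetical: tokens refilled per second
        self.capacity = capacity          # hypothetical: maximum burst size
        self._clock = clock               # injectable clock, useful for testing
        self._tokens = float(capacity)    # start with a full bucket
        self._last = clock()

    def allow(self):
        """Return True if a request may pass to the backend, else False."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.rate_per_sec)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

In a limiter of this kind, reducing `rate_per_sec` or `capacity` lowers the amount of traffic that reaches backend services, at the cost of rejecting or delaying more requests during bursts.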