Healthwise incident

Performance Issues

Minor Resolved

Healthwise experienced a minor incident on September 7, 2022 affecting Coach, Content Browser, and 3 more components, lasting 7h 2m. The incident has been resolved; the full update timeline is below.

Started: Sep 07, 2022, 04:56 PM UTC
Resolved: Sep 07, 2022, 11:58 PM UTC
Duration: 7h 2m
Detected by Pingoru: Sep 07, 2022, 04:56 PM UTC
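
As a quick check of the figures above, the duration is the resolution timestamp minus the start timestamp. The following is a minimal Python sketch; the format string is chosen here to match how times are printed on this page, and the helper names `parse_stamp` and `format_duration` are hypothetical.

```python
from datetime import datetime, timezone

# Format matching the timestamps printed on this page,
# e.g. "Sep 07, 2022, 04:56 PM UTC" (assumes an English locale for "%b").
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_stamp(text: str) -> datetime:
    """Parse a status-page timestamp and mark it as UTC."""
    return datetime.strptime(text, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def format_duration(start: datetime, end: datetime) -> str:
    """Render the elapsed time as '<hours>h <minutes>m' (hypothetical helper)."""
    total_minutes = int((end - start).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m"


started = parse_stamp("Sep 07, 2022, 04:56 PM UTC")
resolved = parse_stamp("Sep 07, 2022, 11:58 PM UTC")
print(format_duration(started, resolved))  # 7h 2m
```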

Affected components

Coach, Content Browser, Custom Content Manager (CCM), EMR Modules, Knowledgebase

Update timeline

  1. investigating Sep 07, 2022, 04:56 PM UTC

    Healthwise-hosted solutions are experiencing performance issues. Our Network Administrators and Engineers are working to fix the problem. We will post updates as we learn more.

  2. investigating Sep 07, 2022, 05:03 PM UTC

    We are continuing to investigate this issue.

  3. investigating Sep 07, 2022, 05:04 PM UTC

    We are continuing to investigate this issue.

  4. investigating Sep 07, 2022, 05:07 PM UTC

    An upstream issue is causing an increase in latency to Healthwise Advise. All other services are nominal. We are continuing to monitor the situation.

  5. investigating Sep 07, 2022, 07:43 PM UTC

    We are continuing to investigate the issue.

  6. monitoring Sep 07, 2022, 10:54 PM UTC

    We have implemented a temporary fix and services are responding more normally. We will continue to monitor the situation as we work on a permanent solution. We will post an update when that is completed.

  7. resolved Sep 07, 2022, 11:58 PM UTC

    All performance issues have been resolved. We will post a root cause analysis once we have completed our full investigation. If the investigation has not been completed within 1 week we will post an interim RCA with the information that we currently have available.

  8. postmortem Sep 15, 2022, 10:20 PM UTC

    # **Introduction**

    The purpose of this Root Cause Analysis (RCA) is to determine the causes that contributed to the performance issues of the Healthwise-hosted solutions on September 7-8, 2022.

    # **Event Description**

    At 10:21 AM MST on Wednesday, September 7th, 2022, Healthwise administrators were alerted to performance degradation in Healthwise-hosted applications. Healthwise found that a third-party multi-regional load balancer was experiencing issues that caused intermittent availability until 3:19 PM MST. Total time of service degradation was 4 hours and 59 minutes.

    At 11:21 AM MST on Wednesday, September 7th, 2022, Healthwise administrators were alerted to Healthwise-hosted product downtime. Investigation found that the Healthwise search index was in a bad state, and it was rebuilt to restore service. At 1:21 PM MST, Healthwise administrators were alerted to additional performance degradation due to system memory resources being overwhelmed. At 3:13 PM MST, Healthwise administrators restarted services and allocated additional resources to the servers, which temporarily restored product availability. The total time of service degradation was 3 hours and 52 minutes.

    At 8:39 AM MST on Thursday, September 8th, 2022, Healthwise administrators were alerted to renewed performance degradation. Investigation found that the Healthwise search index was reporting internal server errors. In-depth analysis of the Healthwise search solution found that a field was missing in a configuration file. At 2:30 PM MST, a fix was deployed to production to resolve the issue. The total time of service disruption was 5 hours and 51 minutes.

    # **Findings and Root Cause**

    Based on the investigation conducted, the team determined the following findings regarding this event:

    Two issues were happening at the same time, which slowed down troubleshooting efforts. The intermittent availability of the third-party multi-regional load balancer made the search configuration issue more difficult to identify because of monitoring dependencies. Additionally, steps to mitigate service degradation successfully stabilized network availability until high network volume caused a second incident and prompted further investigation. This investigation found that a new version of the search solution would error when a configuration field was missing.

    # **Corrective Action**

    Healthwise administrators monitored updates from the third-party multi-regional load balancer provider until service was restored. They also applied temporary fixes by rebuilding the search index, allocating additional resources, and throttling high-bandwidth activity. A long-term fix to properly configure the Healthwise search solution was deployed on 9/8/2022.
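
The root cause described in the postmortem, a search configuration file missing a field that the new version required, is the kind of failure a pre-deployment check can surface early. The sketch below is illustrative only. The RCA does not name the search product, the file format, or the missing field, so the JSON format and the field names (`index_name`, `schema_version`, `analyzer`) are assumptions made for this example and not Healthwise's actual configuration.

```python
import json

# Hypothetical required fields; the RCA does not say which field was missing.
REQUIRED_FIELDS = ("index_name", "schema_version", "analyzer")


def missing_fields(config: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent from the config, in order."""
    return [name for name in required if name not in config]


def validate_config_text(text: str) -> dict:
    """Parse a JSON config and raise if any required field is absent."""
    config = json.loads(text)
    absent = missing_fields(config)
    if absent:
        raise ValueError(
            "search config is missing required field(s): " + ", ".join(absent)
        )
    return config


# Example: a config written before a (hypothetical) 'analyzer' field became required.
old_config = '{"index_name": "content", "schema_version": 2}'
try:
    validate_config_text(old_config)
except ValueError as err:
    print(err)
```

Run as a deployment step, a check like this would fail the release with a message naming the missing field, rather than letting the upgraded service return internal server errors at query time.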