Continu incident

Service Degraded (AWS)

Major Resolved

Continu experienced a major incident on December 15, 2021 affecting AWS ec2-us-west-2, Learn Web API - US, Programmatic API - US, and Learn Web App - US, lasting 42m. The incident has been resolved; the full update timeline is below.

Started
Dec 15, 2021, 03:48 PM UTC
Resolved
Dec 15, 2021, 04:31 PM UTC
Duration
42m
Detected by Pingoru
Dec 15, 2021, 03:48 PM UTC

Affected components

AWS ec2-us-west-2
Learn Web API - US
Programmatic API - US
Learn Web App - US

Update timeline

  1. investigating Dec 15, 2021, 03:48 PM UTC

    Continu is currently experiencing degraded service performance. At this time we believe we are again seeing the start of an Amazon AWS provider outage. Our Engineering team is troubleshooting these issues and monitoring the updates coming from AWS (one of our primary service providers). We will keep this Statuspage up to date. Thank you for your patience.

  2. investigating Dec 15, 2021, 03:49 PM UTC

    We are continuing to investigate this issue.

  3. investigating Dec 15, 2021, 03:52 PM UTC

    We are continuing to investigate this issue.

  4. identified Dec 15, 2021, 04:00 PM UTC

    AWS has identified internet connectivity issues within the regions responsible for serving the Continu application. They are currently working to resolve these issues. Our Engineering team is continuing to monitor this from our end to provide updates. Thank you for your patience.

  5. resolved Dec 15, 2021, 04:31 PM UTC

    AWS has resolved their internet connectivity issues and we're seeing service return to normal. Our Engineering team will continue to monitor the recovery. Once again, thank you for your patience.

  6. postmortem Dec 15, 2021, 04:31 PM UTC

    # Service outage (12/15/2021)

    At approximately 10:40 AM EST, Continu experienced a general service outage in the US-West-2 region. This outage resulted in customer instances becoming unreachable. Below is the chain of events that led to this outage and its resolution.

    * **10:40 AM EST** - The Continu Engineering team began receiving alerts that customer instances in the US-West-2 region were showing signs of degraded performance. The team spun up a call to begin troubleshooting and identified that all instances hosted in this region were slow to load or unreachable.
    * **10:45 AM EST** - During troubleshooting it became apparent that multiple other tools were unresponsive, which indicated a much larger source issue not directly related to Continu. Reports of other services and sites hosted within our primary provider (AWS) becoming unusable began to surface. This coincided with an update to the AWS Global Service Dashboard stating that they were investigating internet connectivity issues.
    * **10:56 AM EST** - Continu customer instances begin to recover and service is restored to normal.
    * **11:01 AM EST** - AWS reports they have identified the root cause of their datacenter internet connectivity issues and begins to see early signs of recovery.
    * **11:14 AM EST** - AWS reports that their connectivity issues have been resolved and closes the incident. Continu concurs but continues to monitor recovery. The Statuspage incident is closed shortly thereafter.

    We at Continu thank you for your patience during this outage.