Vitally incident

Elevated Error Rates in EU

Major Resolved

Vitally experienced a major incident on February 13, 2025 affecting Product and Analytics Data Processing, lasting 4h 30m. The incident has been resolved; the full update timeline is below.

Started
Feb 13, 2025, 11:20 PM UTC
Resolved
Feb 14, 2025, 03:51 AM UTC
Duration
4h 30m
Detected by Pingoru
Feb 13, 2025, 11:20 PM UTC

Affected components

Product
Analytics Data Processing

Update timeline

  1. investigating Feb 13, 2025, 11:20 PM UTC

    We are seeing elevated error rates in requests and general operations in our EU data center. This appears to be an issue internal to AWS, and we are continuing to monitor.

  2. identified Feb 13, 2025, 11:40 PM UTC

    AWS has acknowledged their outage: https://health.aws.amazon.com/health/status

    Operational issue - Multiple services (Stockholm)
    Service: Multiple services
    Severity: Impacted
    Increased Error Rates and Latencies

    Feb 13 3:35 PM PST
    We are investigating increased error rates and latencies for AWS Services in the EU-NORTH-1 Region.

  3. identified Feb 13, 2025, 11:58 PM UTC

    AWS provided an update ~2 minutes ago. We can confirm that the networking issues within the eu-north-1 region persist and affect connectivity to the Vitally web app as well as Vitally's ability to receive analytics data via our APIs.

    Feb 13 3:55 PM PST
    We can confirm increased error rates and latencies for multiple AWS Services in the EU-NORTH-1 Region. This is due to a networking issue that we are actively working to mitigate as quickly as possible, and have all engineers engaged at this time. This issue is not impacting connectivity to and from the region, but is impacting inter-region traffic. During this time, the bulk of the impact will be contained to eun1-az3, but some individual services and operations may be impacted in other zones in the EU-NORTH-1 Region. If possible, we recommend weighting traffic away from AZ eun1-az3. We will provide an additional update within the next 45 minutes, or sooner if we have additional information to share.

  4. identified Feb 14, 2025, 12:32 AM UTC

    AWS has just provided an update, though we are still seeing an elevated error rate. Thank you for your patience as our infrastructure provider addresses their issue.

    Feb 13 4:31 PM PST
    We are seeing early signs of recovery for error rates and latencies for multiple AWS Services in the EU-NORTH-1 Region. As we work toward full recovery, some requests may continue to timeout or be throttled. We recommend customers retry failed requests where possible. We will continue to provide additional information as we have it, or within the next 60 minutes.

  5. resolved Feb 14, 2025, 03:51 AM UTC

    We have seen error rates return to our baseline levels. AWS's most recent update indicates successful mitigation. We will continue to monitor but are not expecting any impact on business-hours operations in our EU data center.
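In update 4, AWS recommended that customers retry failed requests where possible while some requests were still timing out or being throttled. A minimal sketch of one common way to do that, retrying with exponential backoff and jitter, is shown below. It is not taken from Vitally or AWS: `TransientError`, `retry_with_backoff`, and the default delays are hypothetical names and values chosen for illustration, and a real client would map its own timeout and throttling errors onto the retryable case.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a failure worth retrying (timeout, throttle)."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep, rng=random.random):
    """Call operation() until it succeeds or max_attempts is reached.

    Between attempts, wait a random time below an exponentially growing cap.
    The sleep and rng parameters are injectable so the logic can be tested.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # The cap doubles each attempt (base, 2*base, 4*base, ...),
            # but never exceeds max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter: a random wait in [0, cap) so that many clients
            # do not all retry at the same moment.
            sleep(rng() * cap)
```

A caller would wrap a single request, for example `retry_with_backoff(lambda: send_event(payload))`, where `send_event` is a hypothetical function that raises `TransientError` on a timeout or throttling response. Bounding the number of attempts matters during a provider-side outage, since unbounded retries add load while the provider is still recovering.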