Cypress incident

Cypress Cloud Outage

Cypress experienced a critical incident on October 20, 2025, affecting Cloud, Authentication, and additional components (listed below), lasting 11h 54m. The incident has been resolved; the full update timeline is below.

Started: Oct 20, 2025, 09:46 AM UTC
Resolved: Oct 20, 2025, 09:40 PM UTC
Duration: 11h 54m
Detected by Pingoru: Oct 20, 2025, 09:46 AM UTC

Affected components

Cloud, Authentication, Billing, Integrations, Analytics, Accessibility, UI Coverage

Update timeline

investigating Oct 20, 2025, 07:50 AM UTC

We are currently investigating this issue.
investigating Oct 20, 2025, 09:46 AM UTC

An AWS operational issue affecting multiple services in N. Virginia (https://health.aws.amazon.com/health/status?path=open-issues) has been identified as the root cause.
investigating Oct 20, 2025, 10:37 AM UTC

AWS services are incrementally recovering, but compute services are still experiencing issues and capacity remains constrained, which is impacting service restoration.
identified Oct 20, 2025, 10:38 AM UTC

Flagging the incident as identified; recovery is still in progress, with resource constraints impacting service restoration.
identified Oct 20, 2025, 01:02 PM UTC

AWS Services are experiencing network connectivity issues in the US-EAST-1 region. While there are early signs of recovery, multiple Cypress Cloud services continue to experience outages and we are actively pursuing parallel approaches to restore capacity.
identified Oct 20, 2025, 02:55 PM UTC

We have confirmed that Test Recordings to Cypress Cloud are operating normally. UI Coverage and Accessibility processing for recorded runs is operational but experiencing elevated latencies due to limited compute capacity. The Dashboard continues to experience a major outage, and we are actively working to resolve it.
identified Oct 20, 2025, 03:50 PM UTC

The ongoing AWS outage in the US-EAST-1 region has impacted several services that we rely on, especially around message routing and cluster provisioning. We continue to see an outage with our Dashboard. Application quality services are operational but experiencing significant processing delays. Services that handle recordings and other critical areas of Cypress Cloud are operating normally. The team continues to attempt mitigation strategies as we work towards recovering the remaining services as quickly as we can.
identified Oct 20, 2025, 03:52 PM UTC

We are continuing to work on a fix for this issue.
identified Oct 20, 2025, 04:14 PM UTC

We have some initial mitigation in place for the Cypress Cloud UI. We are still experiencing degraded service and underlying outages; we are continuing to take mitigation steps.
identified Oct 20, 2025, 06:31 PM UTC

Some of the key bottlenecks in our recovery process on the AWS side of the infrastructure are beginning to show significant improvement. We have tightened some volumes to help with recovery throughput, which results in intermittent availability of the affected services. In other words, we are still recovering and are not yet at our usual activity volume. As more resources are regained, we will increase throughput into services as well.
identified Oct 20, 2025, 07:43 PM UTC

We are continuing to bring additional infrastructure online; however, we are not yet at our default levels of activity in all places. Upgrading to `monitoring` to communicate the improvement. Note that application quality services are still significantly lagging other services, but we are ramping up infrastructure there as well. Current status: Recording services and analytics are normal. UI is at normal levels of activity and supply. Billing and Integrations still have some third-party dependencies that are recovering. Application quality is significantly lagged.
monitoring Oct 20, 2025, 08:52 PM UTC

Services have reached normal operational levels. We are still seeing slowness in the same services cited as still recovering in the broader AWS outage (e.g., Lambda and compute provisioning). Some Cypress Cloud functions are working through backlogs and remain lagged (they are noted as degraded). Setting incident status to `monitoring`.
resolved Oct 20, 2025, 09:40 PM UTC

We are observing normal operations and performance. The mitigations put into effect earlier have served their purpose, are no longer necessary, and have been rolled back.