Upstash incident

Degraded Performance

Upstash experienced a critical incident on June 11, 2025 affecting N. Virginia, USA (us-east-1), N. California, USA (us-west-1), and 8 more components, lasting 3h 10m. The incident has been resolved; the full update timeline is below.

Started: Jun 11, 2025, 06:51 AM UTC
Resolved: Jun 11, 2025, 10:02 AM UTC
Duration: 3h 10m
Detected by Pingoru: Jun 11, 2025, 06:51 AM UTC

Affected components

N. Virginia, USA (us-east-1)
N. California, USA (us-west-1)
Oregon, USA (us-west-2)
Frankfurt, Germany (eu-central-1)
Ireland (eu-west-1)
Singapore (ap-southeast-1)
Sydney, Australia (ap-southeast-2)
Mumbai, India (ap-south-1)
Tokyo, Japan (ap-northeast-1)
São Paulo, Brazil (sa-east-1)

Update timeline

investigating Jun 11, 2025, 06:51 AM UTC

We are currently investigating this issue.

investigating Jun 11, 2025, 08:24 AM UTC

We are continuing to investigate this issue.

monitoring Jun 11, 2025, 09:20 AM UTC

A fix has been implemented and we are monitoring the results.

resolved Jun 11, 2025, 10:02 AM UTC

This incident has been resolved.

postmortem Jun 11, 2025, 11:49 AM UTC

A routine OS-level maintenance operation applied system updates across multiple EC2 instances in our clusters in several AWS regions. These updates included changes to networking components, which inadvertently triggered restarts. As a result, several EC2 nodes failed health checks and temporarily dropped out of the cluster, disrupting high availability and causing partial connectivity issues for some clients and operations.

We have since reproduced the issue in a controlled environment and verified the root cause. To prevent a recurrence, we are updating our node maintenance strategy to gain greater control over the timing and impact of system-level changes, and we are excluding networking components from automated upgrades.
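
The two remediation ideas in the postmortem (controlling how many nodes are touched at once, and keeping networking components out of automated upgrades) can be illustrated with a small planning sketch. This is not Upstash's tooling. The `Node` type, the `filter_updates` and `plan_batches` helpers, and the package-name prefixes are all hypothetical, chosen only to show the shape of such a policy.

```python
from dataclasses import dataclass, field

# Hypothetical prefixes for packages treated as networking components.
# Illustrative only: the postmortem does not name the packages involved.
NETWORKING_PREFIXES = ("network", "systemd-networkd", "netplan")


@dataclass
class Node:
    name: str
    cluster: str
    pending_updates: list[str] = field(default_factory=list)


def filter_updates(packages: list[str]) -> list[str]:
    """Return the packages that are safe to upgrade automatically,
    dropping any whose name starts with a networking prefix."""
    return [p for p in packages if not p.startswith(NETWORKING_PREFIXES)]


def plan_batches(nodes: list[Node], max_per_cluster: int = 1) -> list[list[Node]]:
    """Greedily group nodes into maintenance batches so that no batch
    contains more than max_per_cluster nodes from the same cluster."""
    if max_per_cluster < 1:
        raise ValueError("max_per_cluster must be at least 1")
    batches: list[list[Node]] = []
    for node in nodes:
        for batch in batches:
            same_cluster = sum(1 for n in batch if n.cluster == node.cluster)
            if same_cluster < max_per_cluster:
                batch.append(node)
                break
        else:
            # No existing batch has room for this cluster; open a new one.
            batches.append([node])
    return batches
```

In a real rollout each batch would presumably be gated on the previous batch passing health checks before the next one starts; that step is omitted here because it depends on infrastructure the postmortem does not describe.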