Upstash Outage History

Upstash had 32 outages since June 3, 2024 (roughly the last 2 years), totaling 62h 31m of downtime and averaging 1.3 incidents per month. Each is summarised below, with incident details, duration, and resolution information.
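The per-month average can be checked from the headline figures with a little arithmetic. The sketch below assumes that "2 years" means exactly 24 months, and it spreads the total downtime over all 32 incidents even though several entries list no duration, so the mean is only indicative.

```python
# Headline figures as stated in the summary above.
outages = 32
downtime_minutes = 62 * 60 + 31  # 62h 31m expressed in minutes
months = 24                      # assumption: "2 years" taken as exactly 24 months

incidents_per_month = outages / months
mean_minutes = downtime_minutes / outages

print(f"{incidents_per_month:.1f} incidents per month")  # prints "1.3 incidents per month"
print(f"mean downtime per incident: {int(mean_minutes // 60)}h {int(mean_minutes % 60)}m")  # prints "... 1h 57m"
```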

Source: https://status.upstash.com

Major May 14, 2026

Upstash Redis – intermittent connection issues in some regions

Detected by Pingoru: May 14, 2026, 03:47 PM UTC
Resolved: May 14, 2026, 05:31 PM UTC
Duration: 1h 44m

Affected: N. Virginia, USA (us-east-1); N. California, USA (us-west-1); Sydney, Australia (ap-southeast-2); Mumbai, India (ap-south-1)

Timeline · 4 updates

monitoring May 14, 2026, 03:47 PM UTC

We identified intermittent connection issues affecting some regions. A fix is being deployed. We'll update the list of affected regions shortly.
identified May 14, 2026, 03:49 PM UTC

The issue has been identified and a fix is being implemented.
monitoring May 14, 2026, 04:00 PM UTC

A fix has been implemented and we are monitoring.
resolved May 14, 2026, 05:31 PM UTC

Earlier today, unexpected load on our proxies caused intermittent connection issues for Upstash Redis in the following regions: us-east-1, us-west-1, ap-southeast-2, and ap-south-1. During this period, some clients may have seen connection timeouts or elevated error rates when reaching their databases. Our team identified the issue quickly and applied workarounds to relieve pressure on the affected proxies. Connection health has since been restored and we've been monitoring the regions to confirm everything is stable. All systems are now operating normally. We appreciate your patience and apologize for any disruption this may have caused.


Critical May 12, 2026

Fly.io Upstash Redis Service Disruption

Detected by Pingoru: May 12, 2026, 09:49 AM UTC
Resolved: May 12, 2026, 01:48 PM UTC
Duration: 3h 59m

Timeline · 5 updates

investigating May 12, 2026, 09:49 AM UTC

Some regions are experiencing connectivity issues due to an ongoing network problem. We are currently investigating.
identified May 12, 2026, 10:50 AM UTC

The issue has been identified and the fix is being implemented.
monitoring May 12, 2026, 12:05 PM UTC

A fix has been implemented and we are monitoring the results.
resolved May 12, 2026, 01:48 PM UTC

This incident has been resolved.
postmortem May 15, 2026, 12:43 PM UTC

On May 12th and 13th at various times, a subset of Upstash Redis instances on Fly.io experienced intermittent hangs and elevated error rates. The Redis process would stall inside a logging syscall (alive but not making progress), which made the issue hard to spot from our usual telemetry. After investigating with Fly's team, we identified the root cause as a bad interaction between a recent guest kernel update on Fly's newer machines and an upstream Cloud Hypervisor bug (cloud-hypervisor#7672, https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7672) affecting log writes from inside the VM. We mitigated by disabling the affected logging paths, and Fly has since rolled out a hypervisor-side patch, fully resolving the issue. No data was lost. Sorry for the disruption.


Major May 11, 2026

Fly.io Upstash Redis service disruption (FRA region)

Detected by Pingoru: May 11, 2026, 03:05 PM UTC
Resolved: May 11, 2026, 06:22 PM UTC
Duration: 3h 16m

Timeline · 5 updates

investigating May 11, 2026, 03:05 PM UTC

Some databases may experience increased latency or timeouts in Fly.io’s FRA region.
investigating May 11, 2026, 03:07 PM UTC

We are continuing to investigate the issue.
investigating May 11, 2026, 05:19 PM UTC

We are working with the Fly team to investigate the root cause.
resolved May 11, 2026, 06:22 PM UTC

The incident has been resolved. We are working with the Fly team on the RCA.
postmortem May 15, 2026, 12:49 PM UTC

On May 12th and 13th at various times, a subset of Upstash Redis instances on Fly.io experienced intermittent hangs and elevated error rates. The Redis process would stall inside a logging syscall (alive but not making progress), which made the issue hard to spot from our usual telemetry. After investigating with Fly's team, we identified the root cause as a bad interaction between a recent guest kernel update on Fly's newer machines and an upstream Cloud Hypervisor bug (cloud-hypervisor#7672, https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7672) affecting log writes from inside the VM. We mitigated by disabling the affected logging paths, and Fly has since rolled out a hypervisor-side patch, fully resolving the issue. No data was lost. Sorry for the disruption.


Critical May 8, 2026

QStash US Region Service Disruption

Detected by Pingoru: May 08, 2026, 09:46 AM UTC
Resolved: May 08, 2026, 10:08 AM UTC
Duration: 22m

Affected: US-EAST-1

Timeline · 5 updates


Minor May 5, 2026

QStash US Region: Schedule Degradation

Detected by Pingoru: May 05, 2026, 02:24 PM UTC
Resolved: May 06, 2026, 08:05 AM UTC
Duration: 17h 40m

Affected: US-EAST-1

Timeline · 6 updates


Major April 27, 2026

QStash us-east-1 Schedules

Detected by Pingoru: Apr 27, 2026, 04:09 PM UTC
Resolved: Apr 27, 2026, 07:13 PM UTC
Duration: 3h 4m

Affected: US-EAST-1

Timeline · 3 updates

identified Apr 27, 2026, 04:09 PM UTC

Some users on QStash US might be experiencing schedule issues.
monitoring Apr 27, 2026, 07:04 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Apr 27, 2026, 07:13 PM UTC

This incident has been resolved.


Minor April 2, 2026

Fly.io Upstash Redis – iad Region Elevated Latency and Temporary Read-Only State

Detected by Pingoru: Apr 02, 2026, 06:45 PM UTC
Resolved: Apr 02, 2026, 07:53 PM UTC
Duration: 1h 8m

Timeline · 2 updates

identified Apr 02, 2026, 06:45 PM UTC

Servers in the iad region experienced unexpected disk load, resulting in elevated latencies and a temporary read-only state. We are migrating replicas to new instances to mitigate the issue and expect to have it fully resolved shortly.
resolved Apr 02, 2026, 07:53 PM UTC

Replication complete, incident resolved.


Notice March 27, 2026

Upstash Redis: GCP Global Connectivity Problems

Detected by Pingoru: Mar 27, 2026, 05:05 PM UTC
Resolved: Mar 27, 2026, 03:30 PM UTC
Duration: —

Timeline · 1 update

resolved Mar 27, 2026, 05:05 PM UTC

Due to a race condition in a process that attaches static IPs to nodes, some of the IPs published in DNS were detached from their nodes, causing timeouts.
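In this entry the Resolved timestamp precedes the Detected one, which is presumably why no duration is shown. The sketch below is one way a duration could be derived from the two timestamps; it is not the listing's actual logic. The function name and the rule of returning nothing for a non-positive interval are assumptions made for illustration, and the parsing assumes an English locale.

```python
from datetime import datetime
from typing import Optional

# Timestamp layout used by the entries in this list, e.g. "Mar 27, 2026, 05:05 PM UTC".
# Parsing month abbreviations and AM/PM this way assumes an English locale.
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def incident_duration(detected: str, resolved: str) -> Optional[str]:
    """Return e.g. '1h 44m', or None when resolved is not after detected."""
    start = datetime.strptime(detected, STAMP_FORMAT)
    end = datetime.strptime(resolved, STAMP_FORMAT)
    total_minutes = int((end - start).total_seconds() // 60)
    if total_minutes <= 0:
        return None  # hypothetical rule: no meaningful duration to report
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m" if hours else f"{minutes}m"


# The May 14, 2026 entry: detected 03:47 PM, resolved 05:31 PM.
print(incident_duration("May 14, 2026, 03:47 PM UTC", "May 14, 2026, 05:31 PM UTC"))  # 1h 44m
# This entry: resolved before detected, so no duration is produced.
print(incident_duration("Mar 27, 2026, 05:05 PM UTC", "Mar 27, 2026, 03:30 PM UTC"))  # None
```

Because the listed timestamps are rounded to the minute, a calculation like this can differ by a minute from some of the durations shown in this list.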


Critical March 6, 2026

Region ap-northeast-1 outage on Upstash Global

Detected by Pingoru: Mar 06, 2026, 01:50 PM UTC
Resolved: Mar 06, 2026, 02:49 PM UTC
Duration: 59m

Affected: Tokyo, Japan (ap-northeast-1)

Timeline · 4 updates

investigating Mar 06, 2026, 01:50 PM UTC

We are currently investigating this issue.
monitoring Mar 06, 2026, 02:08 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Mar 06, 2026, 02:49 PM UTC

This incident has been resolved.
postmortem Mar 09, 2026, 10:07 AM UTC

On March 6, between approximately 13:44 and 14:07 UTC, some databases experienced elevated latency and connection errors in the Tokyo (ap-northeast-1) region. The issue was caused by a sudden spike in traffic that significantly increased network utilization and connection load on a subset of nodes. Our team mitigated the incident by scaling up capacity in the region and redistributing load across additional nodes. Service recovered once the additional capacity was brought online.
Resolution: We have increased the number of machines in the Tokyo region to provide additional headroom and reduce the likelihood of similar incidents during traffic spikes.
Next Steps: We are continuing to review capacity safeguards and connection-handling limits to improve resilience against sudden traffic surges.


Minor January 23, 2026

Upstash Redis: Intermittent latency on us-east-1

Detected by Pingoru: Jan 23, 2026, 05:05 PM UTC
Resolved: Jan 23, 2026, 03:30 PM UTC
Duration: —

Timeline · 1 update

resolved Jan 23, 2026, 05:05 PM UTC

We identified the cause of elevated latency impacting some databases in the us-east-1 region between 15:30 and 15:35 UTC as a sudden surge of connection attempts that hit OS-level connection limits on our proxy layer. This resulted in slower new connection establishment and increased latency for some requests. Databases were not impacted. We are implementing additional proxy-level metrics and safeguards to detect and manage similar edge cases earlier.
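A surge of simultaneous connection attempts like the one described here is often softened on the client side by spacing out reconnects. The sketch below shows exponential backoff with full jitter as one such approach. It is not something the update says Upstash or its clients use, and the function name, parameters, and defaults are hypothetical.

```python
import random
from typing import List, Optional


def reconnect_delays(attempts: int, base: float = 0.1, cap: float = 5.0,
                     rng: Optional[random.Random] = None) -> List[float]:
    """Full-jitter backoff: delay i is drawn uniformly from [0, min(cap, base * 2**i)]."""
    rng = rng or random.Random()
    return [rng.uniform(0.0, min(cap, base * (2 ** i))) for i in range(attempts)]


# Seeded only so the example is repeatable; real clients would not share a seed.
delays = reconnect_delays(attempts=8, rng=random.Random(42))
print([round(d, 3) for d in delays])
```

Randomising each delay means clients that lose their connections at the same moment do not all retry at the same moment, which keeps the number of concurrent new connections lower.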


Minor December 12, 2025

QStash - Message delays for Flow Control configurations

Detected by Pingoru: Dec 12, 2025, 02:34 PM UTC
Resolved: Dec 01, 2025, 07:00 AM UTC
Duration: —

Timeline · 1 update

resolved Dec 12, 2025, 02:34 PM UTC

We identified and fixed a bug that could cause messages with Flow Control enabled to be delayed longer than their configured delay, resulting in unexpectedly long pending times. The fix is in place and the issue should not recur. If you’re still seeing unusually long-delayed messages, please contact Upstash support for help with remediation.


Major December 10, 2025

Connectivity issue impacting Regional Databases in US-East-1

Detected by Pingoru: Dec 10, 2025, 11:50 AM UTC
Resolved: Dec 10, 2025, 12:08 PM UTC
Duration: 17m

Affected: AWS US-EAST-1

Timeline · 2 updates

investigating Dec 10, 2025, 11:50 AM UTC

We identified an issue in Regional Databases in US-East-1 where database replicas have connectivity issues with each other. Other regions are not impacted. Global databases are not impacted. We are working on the issue.
resolved Dec 10, 2025, 12:08 PM UTC

The issue has been identified and replicas were successfully reconnected.


Critical December 5, 2025

Upstash Console and Context7 Console are Currently Experiencing Issues

Detected by Pingoru: Dec 05, 2025, 09:03 AM UTC
Resolved: Dec 05, 2025, 09:18 AM UTC
Duration: 14m

Affected: Upstash Console; Context7 Console

Timeline · 2 updates

investigating Dec 05, 2025, 09:03 AM UTC

Upstream provider confirmed an incident. We are investigating the impact and potential resolutions.
resolved Dec 05, 2025, 09:18 AM UTC

This incident has been resolved.


Major October 20, 2025

Upstash Console issues

Detected by Pingoru: Oct 20, 2025, 02:52 PM UTC
Resolved: Oct 20, 2025, 06:50 PM UTC
Duration: 3h 58m

Affected: Upstash Console

Timeline · 4 updates

investigating Oct 20, 2025, 02:52 PM UTC

We are currently experiencing errors in the Upstash Console due to issues with one of our upstream providers. This may affect access to the dashboard and related operations. Our team is actively monitoring the situation and working to mitigate the impact. We will provide updates as soon as more information becomes available.
monitoring Oct 20, 2025, 03:05 PM UTC

Console is back to normal again. We are currently monitoring.
monitoring Oct 20, 2025, 06:07 PM UTC

Developer API (api.upstash.com) is having availability issues alongside Upstash Console. While we are monitoring the underlying cloud provider's status updates, we are also working on a remediation.
resolved Oct 20, 2025, 06:50 PM UTC

A fix has been deployed as a workaround so that our systems are not affected by the cloud provider's ongoing incident.


Critical October 20, 2025

Login Issues on Upstash Console

Detected by Pingoru: Oct 20, 2025, 07:00 AM UTC
Resolved: Oct 20, 2025, 07:00 AM UTC
Duration: —

Timeline · 1 update

resolved Oct 20, 2025, 10:29 AM UTC

As a side effect of an incident on the underlying cloud provider, Upstash Console had availability issues between 07:00 UTC and 09:23 UTC. Only Upstash Console was impacted; Upstash products remained operational.


Minor October 10, 2025

Connectivity issue on us-east-1

Detected by Pingoru: Oct 10, 2025, 04:47 PM UTC
Resolved: Oct 10, 2025, 03:46 PM UTC
Duration: —

Timeline · 1 update

resolved Oct 10, 2025, 04:47 PM UTC

Between 15:46 and 15:55 UTC, some client connection attempts to databases in us-east-1 timed out due to unexpectedly high load on a server. The node was recovered at 15:50 UTC, and the updated DNS record propagated by 15:55 UTC. Services are operating normally.


Minor September 9, 2025

Temporary Database Routing Issue

Detected by Pingoru: Sep 09, 2025, 03:48 PM UTC
Resolved: Sep 09, 2025, 12:00 PM UTC
Duration: —

Timeline · 1 update

resolved Sep 09, 2025, 03:48 PM UTC

Impact: A subset of clients connecting through the eu-central-1 region experienced increased error rates and timeouts when accessing certain databases. Clients in us-west-2 were also briefly affected. The issue was limited in scope and did not impact other regions.
Root Cause: During an ongoing migration to improve database routing reliability, a configuration step was applied inconsistently across regions.
Resolution: Our monitoring alerted us within minutes, and the migration was promptly rolled back for the affected regions. Service definitions were restored, and normal database connectivity resumed by 15:08 UTC.
Next Steps: We are reviewing our migration process to ensure consistency across all regions and adding additional safeguards to prevent similar issues in the future.


Minor August 18, 2025

Connectivity Issues in us-east-1

Detected by Pingoru: Aug 18, 2025, 12:37 PM UTC
Resolved: Aug 18, 2025, 12:37 PM UTC
Duration: —

Affected: N. Virginia, USA (us-east-1)

Timeline · 2 updates

resolved Aug 18, 2025, 01:30 PM UTC

Between 12:37 UTC and 12:50 UTC, an overload in the connection proxying system resulted in instability for a subset of databases. The root cause was identified as a misconfiguration in the routing rules, which caused certain requests to experience timeouts during the initial phase of a gradual deployment. Upon detection, the deployment was immediately rolled back, restoring normal service. We are reviewing our deployment and configuration validation processes to prevent similar issues in the future.
postmortem Aug 18, 2025, 01:30 PM UTC

Between 12:37 UTC and 12:50 UTC, an overload in the connection proxying system resulted in instability for a subset of databases. The root cause was identified as a misconfiguration in the routing rules, which caused certain requests to experience timeouts during the initial phase of a gradual deployment. Upon detection, the deployment was immediately rolled back, restoring normal service. We are reviewing our deployment and configuration validation processes to prevent similar issues in the future.


Major June 26, 2025

Login problems on Upstash Console

Detected by Pingoru: Jun 26, 2025, 06:53 AM UTC
Resolved: Jun 26, 2025, 07:34 AM UTC
Duration: 41m

Affected: Upstash Console

Timeline · 3 updates

identified Jun 26, 2025, 06:53 AM UTC

Some users may experience login issues due to a disruption in our authentication provider. We’re actively monitoring the situation.
monitoring Jun 26, 2025, 07:12 AM UTC

A fix has been implemented and we are monitoring the results.
resolved Jun 26, 2025, 07:34 AM UTC

This incident has been resolved.


Minor June 12, 2025

QStash: Degraded performance

Detected by Pingoru: Jun 12, 2025, 02:28 PM UTC
Resolved: Jun 12, 2025, 04:07 PM UTC
Duration: 1h 39m

Affected: EU-CENTRAL-1

Timeline · 3 updates

investigating Jun 12, 2025, 02:28 PM UTC

We are currently investigating this issue.
monitoring Jun 12, 2025, 02:40 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Jun 12, 2025, 04:07 PM UTC

This incident has been resolved.


Critical June 11, 2025

Degraded Performance

Detected by Pingoru: Jun 11, 2025, 06:51 AM UTC
Resolved: Jun 11, 2025, 10:02 AM UTC
Duration: 3h 10m

Affected: N. Virginia, USA (us-east-1); N. California, USA (us-west-1); Oregon, USA (us-west-2); Frankfurt, Germany (eu-central-1); Ireland (eu-west-1); Singapore (ap-southeast-1); Sydney, Australia (ap-southeast-2); Mumbai, India (ap-south-1); Tokyo, Japan (ap-northeast-1); São Paulo, Brazil (sa-east-1); Ohio, USA (us-east-2); London, UK (eu-west-2)

Timeline · 5 updates

investigating Jun 11, 2025, 06:51 AM UTC

We are currently investigating this issue.
investigating Jun 11, 2025, 08:24 AM UTC

We are continuing to investigate this issue.
monitoring Jun 11, 2025, 09:20 AM UTC

A fix has been implemented and we are monitoring the results.
resolved Jun 11, 2025, 10:02 AM UTC

This incident has been resolved.
postmortem Jun 11, 2025, 11:49 AM UTC

A routine system maintenance operation at the OS level led to the application of system updates across multiple EC2 instances in our clusters in several AWS regions. These updates included changes to networking components, which inadvertently triggered restarts. As a result, several EC2 nodes failed health checks and temporarily dropped out of the cluster, disrupting high availability and causing partial connectivity issues for some clients and operations. We have since reproduced the issue in a controlled environment and verified the root cause. To prevent a recurrence, we are updating our node maintenance strategy to ensure greater control over the timing and impact of system-level changes and excluding networking components from automated upgrades.


Minor June 10, 2025

Performance Degradation

Detected by Pingoru: Jun 10, 2025, 07:23 AM UTC
Resolved: Jun 10, 2025, 12:08 PM UTC
Duration: 4h 45m

Affected: N. Virginia, USA (us-east-1)

Timeline · 4 updates

investigating Jun 10, 2025, 07:23 AM UTC

We are currently investigating this issue.
monitoring Jun 10, 2025, 07:45 AM UTC

A fix has been implemented and we are monitoring the results.
resolved Jun 10, 2025, 12:08 PM UTC

This incident has been resolved.
postmortem Jun 11, 2025, 11:37 AM UTC

A routine system maintenance operation at the OS level led to the application of system updates across multiple EC2 instances in our clusters in several AWS regions. These updates included changes to networking components, which inadvertently triggered restarts. As a result, several EC2 nodes failed health checks and temporarily dropped out of the cluster, disrupting high availability and causing partial connectivity issues for some clients and operations. We have since reproduced the issue in a controlled environment and verified the root cause. To prevent a recurrence, we are updating our node maintenance strategy to ensure greater control over the timing and impact of system-level changes and excluding networking components from automated upgrades.


Minor May 22, 2025

Global Ireland (eu-west-1) Degraded Performance

Detected by Pingoru: May 22, 2025, 04:44 PM UTC
Resolved: May 22, 2025, 05:41 PM UTC
Duration: 56m

Affected: Ireland (eu-west-1)

Timeline · 2 updates

monitoring May 22, 2025, 04:44 PM UTC

A fix has been implemented and we are monitoring the results.
resolved May 22, 2025, 05:41 PM UTC

This incident has been resolved.


Minor April 2, 2025

QStash: Degraded performance in request processing and event logs

Detected by Pingoru: Apr 02, 2025, 07:15 AM UTC
Resolved: Apr 02, 2025, 09:12 PM UTC
Duration: 13h 57m

Affected: EU-CENTRAL-1

Timeline · 9 updates


Minor March 12, 2025

Performance degradation on QStash

Detected by Pingoru: Mar 12, 2025, 09:43 AM UTC
Resolved: Mar 12, 2025, 10:16 AM UTC
Duration: 33m

Affected: EU-CENTRAL-1

Timeline · 4 updates

investigating Mar 12, 2025, 09:43 AM UTC

We are currently investigating this issue.
monitoring Mar 12, 2025, 10:08 AM UTC

A fix has been implemented and we are monitoring the results.
resolved Mar 12, 2025, 10:16 AM UTC

This incident has been resolved.
postmortem Mar 12, 2025, 12:26 PM UTC

At 09:43 UTC, QStash experienced degraded service when a high number of requests to a specific domain were throttled, resulting in timeouts during an unexpected phase of TCP connection establishment. These requests and the resulting retries consumed excessive network resources and negatively impacted all users. As a quick remediation, we added more resources to QStash; as the longer-term resolution, we improved the timeout mechanism to detect such cases and fail faster.
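The postmortem does not say how the improved timeout mechanism works. As a loosely related illustration, the sketch below uses a per-destination circuit breaker, a different and well-known technique for failing fast when one domain keeps timing out. The class name, thresholds, and domain names are hypothetical and are not taken from QStash.

```python
import time
from typing import Callable, Dict


class FailFastGate:
    """Skip calls to a destination after repeated timeouts, until a cool-down passes."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock
        self._failures: Dict[str, int] = {}
        self._opened_at: Dict[str, float] = {}

    def allow(self, domain: str) -> bool:
        opened = self._opened_at.get(domain)
        if opened is None:
            return True
        if self.clock() - opened >= self.cooldown_s:
            # Cool-down elapsed: let traffic through again and count afresh.
            del self._opened_at[domain]
            self._failures[domain] = 0
            return True
        return False  # fail fast instead of tying up another connection

    def record_timeout(self, domain: str) -> None:
        count = self._failures.get(domain, 0) + 1
        self._failures[domain] = count
        if count >= self.max_failures:
            self._opened_at[domain] = self.clock()

    def record_success(self, domain: str) -> None:
        self._failures[domain] = 0


# Usage with a fake clock so the behaviour is deterministic.
now = [0.0]
gate = FailFastGate(max_failures=2, cooldown_s=10.0, clock=lambda: now[0])
gate.record_timeout("slow.example")
gate.record_timeout("slow.example")
blocked = gate.allow("slow.example")   # False: two timeouts reached the threshold
other = gate.allow("fast.example")     # True: other destinations are unaffected
now[0] = 10.0
reopened = gate.allow("slow.example")  # True: cool-down has elapsed
```

Keeping the state per destination means one slow domain is rejected quickly without holding up requests to every other destination, which is the failure mode the postmortem describes.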
