- Detected by Pingoru
- May 14, 2026, 03:47 PM UTC
- Resolved
- May 14, 2026, 05:31 PM UTC
- Duration
- 1h 44m
Affected: N. Virginia, USA (us-east-1); N. California, USA (us-west-1); Sydney, Australia (ap-southeast-2); Mumbai, India (ap-south-1)
Timeline · 4 updates
-
monitoring May 14, 2026, 03:47 PM UTC
We identified intermittent connection issues affecting some regions. A fix is being deployed. We'll update the list of affected regions shortly.
-
identified May 14, 2026, 03:49 PM UTC
The issue has been identified and a fix is being implemented.
-
monitoring May 14, 2026, 04:00 PM UTC
A fix has been implemented and we are monitoring.
-
resolved May 14, 2026, 05:31 PM UTC
Earlier today, unexpected load on our proxies caused intermittent connection issues for Upstash Redis in the following regions: us-east-1, us-west-1, ap-southeast-2, and ap-south-1. During this period, some clients may have seen connection timeouts or elevated error rates when reaching their databases. Our team identified the issue quickly and applied workarounds to relieve pressure on the affected proxies. Connection health has since been restored and we've been monitoring the regions to confirm everything is stable. All systems are now operating normally. We appreciate your patience and apologize for any disruption this may have caused.
Read the full incident report →
- Detected by Pingoru
- May 12, 2026, 09:49 AM UTC
- Resolved
- May 12, 2026, 01:48 PM UTC
- Duration
- 3h 59m
Timeline · 5 updates
-
investigating May 12, 2026, 09:49 AM UTC
Some regions are experiencing connectivity issues due to an ongoing network problem. We are currently investigating.
-
identified May 12, 2026, 10:50 AM UTC
The issue has been identified and the fix is being implemented.
-
monitoring May 12, 2026, 12:05 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved May 12, 2026, 01:48 PM UTC
This incident has been resolved.
-
postmortem May 15, 2026, 12:43 PM UTC
On May 12th and 13th at various times, a subset of Upstash Redis instances on [Fly.io](http://Fly.io) experienced intermittent hangs and elevated error rates. The Redis process would stall inside a logging syscall (alive but not making progress), which made the issue hard to spot from our usual telemetry. After investigating with Fly's team, we identified the root cause as a bad interaction between a recent guest kernel update on Fly's newer machines and an upstream Cloud Hypervisor bug ([cloud-hypervisor#7672](https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7672)) affecting log writes from inside the VM. We mitigated by disabling the affected logging paths, and Fly has since rolled out a hypervisor-side patch, fully resolving the issue. No data was lost. Sorry for the disruption.
Read the full incident report →
- Detected by Pingoru
- May 11, 2026, 03:05 PM UTC
- Resolved
- May 11, 2026, 06:22 PM UTC
- Duration
- 3h 16m
Timeline · 5 updates
-
investigating May 11, 2026, 03:05 PM UTC
Some databases may experience increased latency or timeouts in Fly.io’s FRA region.
-
investigating May 11, 2026, 03:07 PM UTC
We are continuing to investigate the issue.
-
investigating May 11, 2026, 05:19 PM UTC
We are working with the Fly team to investigate the root cause.
-
resolved May 11, 2026, 06:22 PM UTC
The incident has been resolved. We are working with the Fly team on the RCA.
-
postmortem May 15, 2026, 12:49 PM UTC
On May 12th and 13th at various times, a subset of Upstash Redis instances on [Fly.io](http://Fly.io) experienced intermittent hangs and elevated error rates. The Redis process would stall inside a logging syscall (alive but not making progress), which made the issue hard to spot from our usual telemetry. After investigating with Fly's team, we identified the root cause as a bad interaction between a recent guest kernel update on Fly's newer machines and an upstream Cloud Hypervisor bug ([cloud-hypervisor#7672](https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7672)) affecting log writes from inside the VM. We mitigated by disabling the affected logging paths, and Fly has since rolled out a hypervisor-side patch, fully resolving the issue. No data was lost. Sorry for the disruption.
Read the full incident report →
- Detected by Pingoru
- May 08, 2026, 09:46 AM UTC
- Resolved
- May 08, 2026, 10:08 AM UTC
- Duration
- 22m
Affected: US-EAST-1
Timeline · 5 updates
Read the full incident report →
- Detected by Pingoru
- May 05, 2026, 02:24 PM UTC
- Resolved
- May 06, 2026, 08:05 AM UTC
- Duration
- 17h 40m
Affected: US-EAST-1
Timeline · 6 updates
Read the full incident report →
- Detected by Pingoru
- Apr 27, 2026, 04:09 PM UTC
- Resolved
- Apr 27, 2026, 07:13 PM UTC
- Duration
- 3h 4m
Affected: US-EAST-1
Timeline · 3 updates
-
identified Apr 27, 2026, 04:09 PM UTC
Some users on QStash US might be experiencing schedule issues.
-
monitoring Apr 27, 2026, 07:04 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Apr 27, 2026, 07:13 PM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Apr 02, 2026, 06:45 PM UTC
- Resolved
- Apr 02, 2026, 07:53 PM UTC
- Duration
- 1h 8m
Timeline · 2 updates
-
identified Apr 02, 2026, 06:45 PM UTC
Servers in the iad region experienced unexpected disk load, resulting in elevated latencies and a temporary read-only state. We are migrating replicas to new instances to mitigate the issue and expect to have it fully resolved shortly.
-
resolved Apr 02, 2026, 07:53 PM UTC
Replication complete, incident resolved.
Read the full incident report →
- Detected by Pingoru
- Mar 27, 2026, 05:05 PM UTC
- Resolved
- Mar 27, 2026, 03:30 PM UTC
- Duration
- —
Timeline · 1 update
-
resolved Mar 27, 2026, 05:05 PM UTC
Due to a race condition in a process that attaches static IPs to nodes, some of the IPs listed in DNS were detached from their nodes, causing timeouts.
Read the full incident report →
- Detected by Pingoru
- Mar 06, 2026, 01:50 PM UTC
- Resolved
- Mar 06, 2026, 02:49 PM UTC
- Duration
- 59m
Affected: Tokyo, Japan (ap-northeast-1)
Timeline · 4 updates
-
investigating Mar 06, 2026, 01:50 PM UTC
We are currently investigating this issue.
-
monitoring Mar 06, 2026, 02:08 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Mar 06, 2026, 02:49 PM UTC
This incident has been resolved.
-
postmortem Mar 09, 2026, 10:07 AM UTC
On March 6, between approximately 13:44–14:07 UTC, some databases experienced elevated latency and connection errors in the Tokyo (ap-northeast-1) region. The issue was caused by a sudden spike in traffic that significantly increased network utilization and connection load on a subset of nodes. Our team mitigated the incident by scaling up capacity in the region and redistributing load across additional nodes. Service recovered once the additional capacity was brought online. Resolution: We have increased the number of machines in the Tokyo region to provide additional headroom and reduce the likelihood of similar incidents during traffic spikes. Next Steps: We are continuing to review capacity safeguards and connection-handling limits to improve resilience against sudden traffic surges.
Read the full incident report →
- Detected by Pingoru
- Jan 23, 2026, 05:05 PM UTC
- Resolved
- Jan 23, 2026, 03:30 PM UTC
- Duration
- —
Timeline · 1 update
-
resolved Jan 23, 2026, 05:05 PM UTC
We identified the cause of elevated latency impacting some databases in the us-east-1 region between 15:30 and 15:35 UTC as a sudden surge of connection attempts that hit OS-level connection limits on our proxy layer. This resulted in slower new connection establishment and increased latency for some requests. Databases were not impacted. We are implementing additional proxy-level metrics and safeguards to detect and manage similar edge cases earlier.
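As an illustration only, one kind of proxy-level metric for this class of problem is how close a process is to its open file descriptor limit. The update does not say which OS-level limit was hit or how the metrics are implemented, so the choice of `RLIMIT_NOFILE`, the 80% threshold, and every function name below are assumptions made for this sketch, which targets Linux.

```python
import os
import resource  # Unix-only standard library module
from typing import Tuple


def fd_usage_ratio(open_fds: int, soft_limit: int) -> float:
    """Fraction of the per-process descriptor limit currently in use."""
    if soft_limit <= 0:
        # An unlimited soft limit may be reported as a negative value.
        raise ValueError("soft_limit must be a positive number")
    return open_fds / soft_limit


def near_limit(open_fds: int, soft_limit: int, threshold: float = 0.8) -> bool:
    """True when usage reaches the (hypothetical) alerting threshold."""
    return fd_usage_ratio(open_fds, soft_limit) >= threshold


def current_fd_usage() -> Tuple[int, int]:
    """Return (open descriptors, soft limit) for this process on Linux."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Each entry in /proc/self/fd is one open descriptor of this process.
    open_fds = len(os.listdir("/proc/self/fd"))
    return open_fds, soft
```

A monitoring loop could call `current_fd_usage()` periodically and raise an alert when `near_limit(...)` returns true, well before new connections start failing.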
Read the full incident report →
- Detected by Pingoru
- Dec 12, 2025, 02:34 PM UTC
- Resolved
- Dec 01, 2025, 07:00 AM UTC
- Duration
- —
Timeline · 1 update
-
resolved Dec 12, 2025, 02:34 PM UTC
We identified and fixed a bug that could cause messages with Flow Control enabled to be delayed longer than their configured delay, resulting in unexpectedly long pending times. The fix is in place and the issue should not recur. If you’re still seeing unusually long-delayed messages, please contact [email protected] and we can help with remediation.
Read the full incident report →
- Detected by Pingoru
- Dec 10, 2025, 11:50 AM UTC
- Resolved
- Dec 10, 2025, 12:08 PM UTC
- Duration
- 17m
Affected: AWS US-EAST-1
Timeline · 2 updates
-
investigating Dec 10, 2025, 11:50 AM UTC
We identified an issue in Regional Databases in US-East-1 where database replicas have connectivity issues with each other. Other regions are not impacted. Global databases are not impacted. We are working on the issue.
-
resolved Dec 10, 2025, 12:08 PM UTC
Issue has been identified and replicas were successfully reconnected.
Read the full incident report →
Critical December 5, 2025 - Detected by Pingoru
- Dec 05, 2025, 09:03 AM UTC
- Resolved
- Dec 05, 2025, 09:18 AM UTC
- Duration
- 14m
Affected: Upstash Console; Context7 Console
Timeline · 2 updates
-
investigating Dec 05, 2025, 09:03 AM UTC
Upstream provider confirmed an incident. We are investigating the impact and potential resolutions.
-
resolved Dec 05, 2025, 09:18 AM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Oct 20, 2025, 02:52 PM UTC
- Resolved
- Oct 20, 2025, 06:50 PM UTC
- Duration
- 3h 58m
Affected: Upstash Console
Timeline · 4 updates
-
investigating Oct 20, 2025, 02:52 PM UTC
We are currently experiencing errors in the Upstash Console due to issues with one of our upstream providers. This may affect access to the dashboard and related operations. Our team is actively monitoring the situation and working to mitigate the impact. We will provide updates as soon as more information becomes available.
-
monitoring Oct 20, 2025, 03:05 PM UTC
Console is back to normal again. We are currently monitoring.
-
monitoring Oct 20, 2025, 06:07 PM UTC
Developer API (api.upstash.com) is having availability issues alongside Upstash Console. While we are monitoring the underlying cloud provider's status updates, we are also working on a remediation.
-
resolved Oct 20, 2025, 06:50 PM UTC
A fix has been deployed as a workaround so that our systems are not affected by the cloud provider's ongoing incident.
Read the full incident report →
Critical October 20, 2025 - Detected by Pingoru
- Oct 20, 2025, 07:00 AM UTC
- Resolved
- Oct 20, 2025, 07:00 AM UTC
- Duration
- —
Timeline · 1 update
-
resolved Oct 20, 2025, 10:29 AM UTC
As a side effect of an incident on the underlying cloud provider, Upstash Console had availability issues between 07:00 UTC and 09:23 UTC. Only Upstash Console was impacted; Upstash products remained operational.
Read the full incident report →
- Detected by Pingoru
- Oct 10, 2025, 04:47 PM UTC
- Resolved
- Oct 10, 2025, 03:46 PM UTC
- Duration
- —
Timeline · 1 update
-
resolved Oct 10, 2025, 04:47 PM UTC
Between 15:46 and 15:55 UTC, some client connection attempts to databases in us-east-1 timed out due to unexpectedly high load on a server. The node was recovered at 15:50 UTC, and the updated DNS record propagated by 15:55 UTC. Services are operating normally.
Read the full incident report →
- Detected by Pingoru
- Sep 09, 2025, 03:48 PM UTC
- Resolved
- Sep 09, 2025, 12:00 PM UTC
- Duration
- —
Timeline · 1 update
-
resolved Sep 09, 2025, 03:48 PM UTC
Impact: A subset of clients connecting through the eu-central-1 region experienced increased error rates and timeouts when accessing certain databases. Clients in us-west-2 were also briefly affected. The issue was limited in scope and did not impact other regions. Root Cause: During an ongoing migration to improve database routing reliability, a configuration step was applied inconsistently across regions. Resolution: Our monitoring alerted us within minutes, and the migration was promptly rolled back for the affected regions. Service definitions were restored, and normal database connectivity resumed by 15:08 UTC. Next Steps: We are reviewing our migration process to ensure consistency across all regions and adding additional safeguards to prevent similar issues in the future.
Read the full incident report →
- Detected by Pingoru
- Aug 18, 2025, 12:37 PM UTC
- Resolved
- Aug 18, 2025, 12:37 PM UTC
- Duration
- —
Affected: N. Virginia, USA (us-east-1)
Timeline · 2 updates
-
resolved Aug 18, 2025, 01:30 PM UTC
Between 12:37 UTC and 12:50 UTC, an overload in the connection proxying system resulted in instability for a subset of databases. The root cause was identified as a misconfiguration in the routing rules, which caused certain requests to experience timeouts during the initial phase of a gradual deployment. Upon detection, the deployment was immediately rolled back, restoring normal service. We are reviewing our deployment and configuration validation processes to prevent similar issues in the future.
-
postmortem Aug 18, 2025, 01:30 PM UTC
Between 12:37 UTC and 12:50 UTC, an overload in the connection proxying system resulted in instability for a subset of databases. The root cause was identified as a misconfiguration in the routing rules, which caused certain requests to experience timeouts during the initial phase of a gradual deployment. Upon detection, the deployment was immediately rolled back, restoring normal service. We are reviewing our deployment and configuration validation processes to prevent similar issues in the future.
Read the full incident report →
- Detected by Pingoru
- Jun 26, 2025, 06:53 AM UTC
- Resolved
- Jun 26, 2025, 07:34 AM UTC
- Duration
- 41m
Affected: Upstash Console
Timeline · 3 updates
-
identified Jun 26, 2025, 06:53 AM UTC
Some users may experience login issues due to a disruption in our authentication provider. We’re actively monitoring the situation.
-
monitoring Jun 26, 2025, 07:12 AM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Jun 26, 2025, 07:34 AM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Jun 12, 2025, 02:28 PM UTC
- Resolved
- Jun 12, 2025, 04:07 PM UTC
- Duration
- 1h 39m
Affected: EU-CENTRAL-1
Timeline · 3 updates
-
investigating Jun 12, 2025, 02:28 PM UTC
We are currently investigating this issue.
-
monitoring Jun 12, 2025, 02:40 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Jun 12, 2025, 04:07 PM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Jun 11, 2025, 06:51 AM UTC
- Resolved
- Jun 11, 2025, 10:02 AM UTC
- Duration
- 3h 10m
Affected: N. Virginia, USA (us-east-1); N. California, USA (us-west-1); Oregon, USA (us-west-2); Frankfurt, Germany (eu-central-1); Ireland (eu-west-1); Singapore (ap-southeast-1); Sydney, Australia (ap-southeast-2); Mumbai, India (ap-south-1); Tokyo, Japan (ap-northeast-1); São Paulo, Brazil (sa-east-1); Ohio, USA (us-east-2); London, UK (eu-west-2)
Timeline · 5 updates
-
investigating Jun 11, 2025, 06:51 AM UTC
We are currently investigating this issue.
-
investigating Jun 11, 2025, 08:24 AM UTC
We are continuing to investigate this issue.
-
monitoring Jun 11, 2025, 09:20 AM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Jun 11, 2025, 10:02 AM UTC
This incident has been resolved.
-
postmortem Jun 11, 2025, 11:49 AM UTC
A routine system maintenance operation at the OS level led to the application of system updates across multiple EC2 instances in our clusters in several AWS regions. These updates included changes to networking components, which inadvertently triggered restarts. As a result, several EC2 nodes failed health checks and temporarily dropped out of the cluster, disrupting high availability and causing partial connectivity issues for some clients and operations. We have since reproduced the issue in a controlled environment and verified the root cause. To prevent a recurrence, we are updating our node maintenance strategy to ensure greater control over the timing and impact of system-level changes and excluding networking components from automated upgrades.
Read the full incident report →
- Detected by Pingoru
- Jun 10, 2025, 07:23 AM UTC
- Resolved
- Jun 10, 2025, 12:08 PM UTC
- Duration
- 4h 45m
Affected: N. Virginia, USA (us-east-1)
Timeline · 4 updates
-
investigating Jun 10, 2025, 07:23 AM UTC
We are currently investigating this issue.
-
monitoring Jun 10, 2025, 07:45 AM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Jun 10, 2025, 12:08 PM UTC
This incident has been resolved.
-
postmortem Jun 11, 2025, 11:37 AM UTC
A routine system maintenance operation at the OS level led to the application of system updates across multiple EC2 instances in our clusters in several AWS regions. These updates included changes to networking components, which inadvertently triggered restarts. As a result, several EC2 nodes failed health checks and temporarily dropped out of the cluster, disrupting high availability and causing partial connectivity issues for some clients and operations. We have since reproduced the issue in a controlled environment and verified the root cause. To prevent a recurrence, we are updating our node maintenance strategy to ensure greater control over the timing and impact of system-level changes and excluding networking components from automated upgrades.
Read the full incident report →
- Detected by Pingoru
- May 22, 2025, 04:44 PM UTC
- Resolved
- May 22, 2025, 05:41 PM UTC
- Duration
- 56m
Affected: Ireland (eu-west-1)
Timeline · 2 updates
-
monitoring May 22, 2025, 04:44 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved May 22, 2025, 05:41 PM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Apr 02, 2025, 07:15 AM UTC
- Resolved
- Apr 02, 2025, 09:12 PM UTC
- Duration
- 13h 57m
Affected: EU-CENTRAL-1
Timeline · 9 updates
Read the full incident report →
- Detected by Pingoru
- Mar 12, 2025, 09:43 AM UTC
- Resolved
- Mar 12, 2025, 10:16 AM UTC
- Duration
- 33m
Affected: EU-CENTRAL-1
Timeline · 4 updates
-
investigating Mar 12, 2025, 09:43 AM UTC
We are currently investigating this issue.
-
monitoring Mar 12, 2025, 10:08 AM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Mar 12, 2025, 10:16 AM UTC
This incident has been resolved.
-
postmortem Mar 12, 2025, 12:26 PM UTC
At 09:43 UTC, QStash experienced degraded service when a high number of requests to a specific domain were throttled, resulting in timeouts during an unexpected phase of TCP connection establishment. These requests and the resulting retries consumed excessive network resources and negatively impacted all users. As a quick remediation, we added more resources to QStash. For the resolution, we improved the timeout mechanism to detect such cases and fail faster.
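The postmortem does not describe how the improved timeout mechanism works. As a hedged illustration of the general idea of failing fast, the sketch below shows a per-domain circuit breaker: after a few consecutive timeouts to one destination, further requests to it are rejected immediately for a cooldown period instead of tying up network resources. The class name, thresholds, and injected clock are hypothetical and not taken from QStash.

```python
import time
from typing import Callable, Dict


class DomainCircuitBreaker:
    """Reject requests to a domain for a while after repeated timeouts."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock  # injectable so the logic can be tested
        self._failures: Dict[str, int] = {}
        self._opened_at: Dict[str, float] = {}

    def allow(self, domain: str) -> bool:
        """Return False while the domain is inside its cooldown window."""
        opened = self._opened_at.get(domain)
        if opened is None:
            return True
        if self.clock() - opened >= self.cooldown_s:
            # Cooldown elapsed: admit traffic again and restart the count.
            del self._opened_at[domain]
            self._failures[domain] = 0
            return True
        return False

    def record_timeout(self, domain: str) -> None:
        """Count a timeout; open the breaker once the limit is reached."""
        count = self._failures.get(domain, 0) + 1
        self._failures[domain] = count
        if count >= self.max_failures:
            self._opened_at[domain] = self.clock()

    def record_success(self, domain: str) -> None:
        """A successful request resets the consecutive-timeout count."""
        self._failures[domain] = 0
```

A caller would check `allow(domain)` before opening a connection and report the outcome with `record_timeout` or `record_success`. Because the state is kept per domain, one slow destination does not block delivery to others.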
Read the full incident report →