LiveKit Outage History

LiveKit had 66 outages in the last 2 years totaling 97h 31m of downtime — averaging 2.7 incidents per month.

There were 66 LiveKit outages since July 30, 2025, totaling 97h 31m of downtime. Each is summarised below with incident details, duration, and resolution information.

Source: https://status.livekit.io
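
The Detected, Resolved, and Duration fields in the entries below are related by simple timestamp arithmetic. The following is a minimal Python sketch of that relationship, not Pingoru's actual implementation: the names `parse_ts` and `format_duration` are hypothetical, and two rules are assumptions inferred from the entries on this page (an interval that is not positive gets no duration, and only the two largest units are shown). Because the listed timestamps are rounded to the minute, a computed value can differ from a listed one by a minute.

```python
from datetime import datetime, timedelta
from typing import Optional

TS_FORMAT = "%b %d, %Y, %I:%M %p UTC"  # e.g. "Feb 27, 2026, 12:18 AM UTC"

def parse_ts(text: str) -> datetime:
    """Parse a timestamp in the style used by the entries on this page."""
    return datetime.strptime(text, TS_FORMAT)

def format_duration(detected: str, resolved: str) -> Optional[str]:
    """Return the interval as e.g. '2h 22m', or None when it is not positive."""
    delta = parse_ts(resolved) - parse_ts(detected)
    if delta <= timedelta(0):
        # Assumption: entries resolved at or before detection show no duration.
        return None
    minutes = int(delta.total_seconds() // 60)
    days, rem = divmod(minutes, 24 * 60)
    hours, mins = divmod(rem, 60)
    if days:
        return f"{days}d {hours}h"
    if hours:
        return f"{hours}h {mins}m"
    return f"{mins}m"

print(format_duration("Feb 27, 2026, 12:18 AM UTC", "Feb 27, 2026, 02:40 AM UTC"))  # 2h 22m
print(format_duration("Mar 14, 2026, 12:56 AM UTC", "Mar 13, 2026, 08:00 PM UTC"))  # None
```

Under these assumptions, an entry whose Resolved time is equal to or earlier than its Detected time carries no computed duration, which matches the entries below that list a dash instead of a value.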

Minor March 14, 2026

Increased participant subscription failures

Detected by Pingoru: Mar 14, 2026, 12:56 AM UTC
Resolved: Mar 13, 2026, 08:00 PM UTC
Duration: —

Timeline · 1 update

resolved Mar 14, 2026, 12:56 AM UTC

Between 21:00 and 21:15 UTC, a limited number of participant subscriptions failed across regions, and subscribers in the Chicago region saw increased subscription latency.

Notice March 2, 2026

Inbound SIP (Twilio/UDP) Degradation | US East

Detected by Pingoru: Mar 02, 2026, 10:30 AM UTC
Resolved: Mar 02, 2026, 10:30 AM UTC
Duration: —

Timeline · 2 updates

Minor February 27, 2026

Analytics dashboard updates delayed

Detected by Pingoru: Feb 27, 2026, 12:18 AM UTC
Resolved: Feb 27, 2026, 02:40 AM UTC
Duration: 2h 22m

Affected: Cloud Dashboard (cloud.livekit.io)

Timeline · 2 updates

identified Feb 27, 2026, 12:18 AM UTC

Updates to the LiveKit Cloud analytics dashboards are currently delayed due to an issue with our processing pipeline. Ingestion of new data is not affected, and we are in the process of recovering all data. We'll share another update once the restoration is complete and dashboards are updating normally.
resolved Feb 27, 2026, 02:40 AM UTC

Analytics dashboards are updating normally again.

Notice February 20, 2026

Reports of SIP INVITES not getting responses in Chicago region

Detected by Pingoru: Feb 20, 2026, 09:47 PM UTC
Resolved: Feb 21, 2026, 04:54 AM UTC
Duration: 7h 7m

Affected: US Central - SIP

Timeline · 4 updates

investigating Feb 20, 2026, 09:47 PM UTC

We are currently investigating user reports of SIP invites not getting responses in our Chicago region. We have not yet determined if the issue is due to the users' trunking providers, but we have begun routing traffic away from Chicago to our other US clusters to ensure continued service.
monitoring Feb 20, 2026, 10:41 PM UTC

The users who made the original reports are no longer seeing issues. We will continue investigating and post the results of our investigation as soon as they are available, but there are currently no known issues in LiveKit's SIP infrastructure.
resolved Feb 21, 2026, 04:54 AM UTC

We are going to keep traffic routing to our other US clusters, but we will consider this incident closed while we continue to investigate its origin as there is currently no impact to users.
postmortem Mar 06, 2026, 09:20 AM UTC

**Root Cause**

This issue had the same root cause as the [incident](https://status.livekit.io/incidents/xkcpnmycy5m3) that occurred after it in the US East region. Please see that report for the root cause, technical details, monitoring, and mitigations.

**Timeline**

2026-02-20 19:00 UTC – Isolated customer reports came in of SIP INVITEs receiving no responses.
2026-02-20 20:34 UTC – After reviewing several customer examples, we determined that the common factor in the problematic calls was the Chicago-based SIP load balancer.
2026-02-20 21:26 UTC – SIP in Chicago was drained after ensuring there was enough capacity in other regions.

Major February 20, 2026

8% of RTC connections hanging in US-East

Detected by Pingoru: Feb 20, 2026, 09:12 AM UTC
Resolved: Feb 14, 2026, 04:00 PM UTC
Duration: —

Timeline · 2 updates

Minor January 28, 2026

SIP INVITE requests unacknowledged in the EU

Detected by Pingoru: Jan 28, 2026, 04:45 PM UTC
Resolved: Jan 28, 2026, 09:45 PM UTC
Duration: 4h 59m

Affected: Europe Central - SIP, Europe South - SIP, United Kingdom - SIP

Timeline · 8 updates

Minor January 23, 2026

Elevated API latency affecting Real Time Communication services in London Region

Detected by Pingoru: Jan 23, 2026, 02:39 PM UTC
Resolved: Jan 23, 2026, 10:24 PM UTC
Duration: 7h 45m

Affected: United Kingdom - Real Time Communication

Timeline · 6 updates

investigating Jan 23, 2026, 02:39 PM UTC

We are currently investigating reports of elevated API latency in the UK region.
identified Jan 23, 2026, 02:50 PM UTC

We are routing traffic away from the London cluster to mitigate impact.
identified Jan 23, 2026, 04:21 PM UTC

We are continuing to investigate the root cause of the performance degradation of the London cluster. We will continue to share updates as we learn more.
monitoring Jan 23, 2026, 06:38 PM UTC

All non-pinned traffic has been healthy since we began routing traffic through the nearest healthy nodes at 14:50 UTC. Our team is still working to bring the London cluster back to full health.
resolved Jan 23, 2026, 10:24 PM UTC

We brought the London cluster back to full health at 21:40 UTC. This incident is now resolved. We will publish a postmortem outlining the root cause and the steps we will take to ameliorate this issue in the future as soon as possible.
postmortem Jan 27, 2026, 09:35 PM UTC

Our cloud provider’s managed Kubernetes control plane experienced an issue that disrupted connectivity to cluster nodes, dropping some of the workloads in the cluster. Due to a bug in our auto-draining logic, detection of the issue was delayed. We have since fixed the alerting to ensure faster detection and response if a similar situation should occur in the future.

Minor January 22, 2026

Issues with a subset of inference requests to Gemini models

Detected by Pingoru: Jan 22, 2026, 06:29 PM UTC
Resolved: Jan 22, 2026, 07:46 PM UTC
Duration: 1h 16m

Affected: Global Inference

Timeline · 3 updates

identified Jan 22, 2026, 06:29 PM UTC

We're investigating an issue with a subset of inference calls made to Gemini models.
monitoring Jan 22, 2026, 06:34 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Jan 22, 2026, 07:46 PM UTC

This incident has been resolved.

Major January 9, 2026

SIP outbound call issues

Detected by Pingoru: Jan 09, 2026, 10:00 AM UTC
Resolved: Jan 09, 2026, 10:24 AM UTC
Duration: 24m

Affected: US West - SIP, US East - SIP, US Central - SIP

Timeline · 5 updates

Minor January 7, 2026

US East RTC: Short-Lived Connectivity Interruption

Detected by Pingoru: Jan 07, 2026, 04:00 PM UTC
Resolved: Jan 07, 2026, 04:00 PM UTC
Duration: —

Timeline · 1 update

resolved Jan 07, 2026, 08:40 PM UTC

Between 16:01 UTC and 16:05 UTC, ~9-10% of participants connecting to the US East region experienced disconnection/reconnection to the RTC service. Client SDKs automatically reconnect, and the majority of affected users' access was restored by 16:05 UTC.

Cause: A code change for a new feature was the root cause of the issue. The new feature is behind a flag that is disabled by default. However, the code changes made to support the new feature affected an existing path that did not implement the feature flag, and some mobile phone clients triggered its use.

Resolution: The deployment that caused the issue was rolled back. The erroneous code path is being fixed, and more tests will be added to ensure that it exhibits the expected behavior.
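
The failure mode described in this report, a flag that is off by default but is not consulted on every path that reaches the new behaviour, can be illustrated with a small Python sketch. Everything here is hypothetical: the `Flags` class, the two handler functions, and the `"mobile"` client kind are illustrative names, not LiveKit code. The sketch only shows why a test that exercises every entry path with the flag disabled would catch this class of bug.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Flags:
    new_feature: bool = False  # disabled by default

def handle_gated(client_kind: str, flags: Flags) -> str:
    """Path that consults the flag before using the new behaviour."""
    return "new" if flags.new_feature else "legacy"

def handle_ungated(client_kind: str, flags: Flags) -> str:
    """Existing path changed to support the feature without checking the flag."""
    if client_kind == "mobile":
        return "new"  # bug: reachable while the flag is off
    return "legacy"

def paths_leaking_new_behaviour(handlers, client_kinds):
    """Return (handler name, client kind) pairs that use the new behaviour with the flag off."""
    off = Flags(new_feature=False)
    return [
        (handler.__name__, kind)
        for handler in handlers
        for kind in client_kinds
        if handler(kind, off) == "new"
    ]

leaks = paths_leaking_new_behaviour([handle_gated, handle_ungated], ["web", "mobile"])
print(leaks)  # [('handle_ungated', 'mobile')]
```

A check of this shape fails as soon as any path reaches the new behaviour while the flag is disabled, regardless of which client triggers it.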

Minor December 31, 2025

Subset of inbound calls not being handled

Detected by Pingoru: Dec 31, 2025, 07:24 PM UTC
Resolved: Jan 01, 2026, 04:02 AM UTC
Duration: 8h 37m

Affected: Global SIP

Timeline · 4 updates

Major December 28, 2025

Connection errors from Chicago region

Detected by Pingoru: Dec 28, 2025, 12:29 AM UTC
Resolved: Dec 28, 2025, 12:37 AM UTC
Duration: 8m

Affected: US Central - Real Time Communication

Timeline · 4 updates

investigating Dec 28, 2025, 12:29 AM UTC

We are investigating reports of connection failures originating from the Chicago region.
identified Dec 28, 2025, 12:34 AM UTC

The error is due to an expired certificate in the US Central cluster. We are routing traffic around the region.
resolved Dec 28, 2025, 12:37 AM UTC

We've resolved the root cause of the expiring cert, and service has been fully restored. This impacted core connectivity and API in the US Central (Chicago) region, including RTC, SIP, Phone numbers, Egress, and Ingress. A postmortem will follow shortly.
postmortem Dec 28, 2025, 06:22 PM UTC

The service we use to auto-renew certificates throttles aggressively. When a request to renew a cert is throttled or fails, we automatically retry the renewal. This process usually resolves and the cert is renewed. However, in some cases the renewal process can get stuck; we have alerting set up to catch these instances. Unfortunately, a logic error was introduced into this alert, so in this case a stuck renewal process did not alert and the certificate eventually expired. We have validated that no other certs are currently in this state, and we have resolved the issue with the alerting. In the future, we will monitor changes made to this alerting process to ensure this does not happen again.
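
One way to guard against a stuck renewal going unnoticed is to alert on the certificate's remaining lifetime directly, independent of the state of the renewal process. The sketch below is a minimal Python illustration of that idea, not LiveKit's alerting: the function `needs_alert`, the 14-day threshold, and the sample dates are all assumptions.

```python
from datetime import datetime, timedelta, timezone

def needs_alert(not_after: datetime, now: datetime,
                threshold: timedelta = timedelta(days=14)) -> bool:
    """True when the certificate expires within `threshold` or has already expired.

    The check looks only at the expiry time, so it still fires if the
    renewal job is stuck or the renewal job's own alert is broken.
    """
    return not_after - now <= threshold

now = datetime(2025, 12, 20, tzinfo=timezone.utc)
print(needs_alert(datetime(2026, 3, 1, tzinfo=timezone.utc), now))    # False
print(needs_alert(datetime(2025, 12, 28, tzinfo=timezone.utc), now))  # True
```

Because the check does not depend on the renewal process reporting its own state, it acts as an independent backstop to the stuck-renewal alert described above.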

Minor December 11, 2025

Elevated setup times for Egress and SIP in Frankfurt region

Detected by Pingoru: Dec 11, 2025, 04:02 PM UTC
Resolved: Dec 12, 2025, 06:42 PM UTC
Duration: 1d 2h

Affected: Europe Central - SIP, Europe Central - Egress

Timeline · 2 updates

investigating Dec 12, 2025, 07:37 PM UTC

We are currently investigating reports of increased API response times for Egress and SIP in our Frankfurt region.
resolved Dec 12, 2025, 11:49 PM UTC

This incident has been resolved. p99 latency between services in the cluster and the DB instance in the eu-central region was elevated from a single cluster. Traffic has been routed around the cluster.

Minor December 7, 2025

Analytics data is delayed

Detected by Pingoru: Dec 07, 2025, 08:07 PM UTC
Resolved: Dec 08, 2025, 03:10 PM UTC
Duration: 19h 3m

Affected: Cloud Dashboard (cloud.livekit.io)

Timeline · 3 updates

identified Dec 07, 2025, 08:07 PM UTC

We are observing delays in the analytics data that's visible from the dashboard, and we are working on processing the backlog of data. This impacts Session detail page, Egresses, Ingresses, and SIP dashboards. Sessions listing and Agent insights are not affected.
identified Dec 08, 2025, 08:29 AM UTC

We've implemented fixes to help the data processor catch up. Live processing will be enabled soon.
resolved Dec 08, 2025, 03:10 PM UTC

A fix has been implemented. The processing is back online and will catch up to real time. We will backfill the data for the affected duration next.

Minor December 5, 2025

Cloud Dashboard Outage Due to Cloudflare Outage

Detected by Pingoru: Dec 05, 2025, 09:00 AM UTC
Resolved: Dec 05, 2025, 09:30 AM UTC
Duration: 30m

Timeline · 1 update

resolved Dec 05, 2025, 09:37 AM UTC

User traffic was not impacted, but our customer dashboard was not operational because of a major Cloudflare outage, which lasted roughly from 08:45 to 09:15 UTC. https://www.cloudflarestatus.com/incidents/k9ppxftx8bs5 has some details, though it appears the issue may have impacted Cloudflare caching as well. Docker Hub may have been impacted at the same time (https://www.dockerstatus.com/pages/incident/533c6539221ae15e3f000031/6932a03b2276fb05ff7ee506), but at this point we see no production impact on LiveKit.

Notice December 4, 2025

Network instability in some Asia regions

Detected by Pingoru: Dec 04, 2025, 05:46 AM UTC
Resolved: Dec 03, 2025, 03:00 PM UTC
Duration: —

Timeline · 1 update

resolved Dec 04, 2025, 05:46 AM UTC

On 2025-12-03 from 15:06 to 15:26 UTC, network connectivity was flapping in our India, Israel, Saudi Arabia and UAE regions. During that period, attempts to initiate RTC connections or API requests to those regions would have failed intermittently. Clients failing to connect would have been routed to a nearby available region on a subsequent attempt. Clients would have experienced a slightly higher join time because of that.

Minor November 18, 2025

Web page outage

Detected by Pingoru: Nov 18, 2025, 07:42 PM UTC
Resolved: Nov 18, 2025, 07:42 PM UTC
Duration: —

Timeline · 1 update

resolved Nov 18, 2025, 07:42 PM UTC

At 11:48 UTC a Cloudflare outage affected the ability to access LiveKit web pages, including www.livekit.io and our Cloud Dashboard (cloud.livekit.io). All other systems remained functional and were unaffected. End user services, including RTC, Ingress, Egress, Hosted Agents, and Inference Gateway remained functional throughout the outage. The issue was resolved by Cloudflare by 17:44 UTC.

Minor November 10, 2025

US - Issues with Track Egress starts

Detected by Pingoru: Nov 10, 2025, 04:29 PM UTC
Resolved: Nov 10, 2025, 06:30 PM UTC
Duration: 2h 1m

Affected: US West - Egress, US East - Egress, US Central - Egress

Timeline · 3 updates

Minor November 6, 2025

EU Central - Cloud Agents creation issues

Detected by Pingoru: Nov 06, 2025, 03:39 PM UTC
Resolved: Nov 06, 2025, 10:03 PM UTC
Duration: 6h 24m

Affected: Europe Central - Cloud Agents

Timeline · 4 updates

investigating Nov 06, 2025, 03:39 PM UTC

We are investigating intermittent issues with creating new cloud agents in EU Central. Running agents in this region are not affected.
identified Nov 06, 2025, 04:38 PM UTC

The issue has been identified and a fix is being implemented.
monitoring Nov 06, 2025, 07:28 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Nov 06, 2025, 10:03 PM UTC

This incident has been resolved. An upstream service provider was having intermittent timeout issues. We've added retries to this interaction and we have not seen a reoccurrence.
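
The mitigation mentioned in the resolution, retrying a call to an upstream service that times out intermittently, might look like the following Python sketch. The function `call_with_retries`, its parameters, and the backoff values are hypothetical; the report does not describe the actual retry policy.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def call_with_retries(fn: Callable[[], T], attempts: int = 3,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn`, retrying on TimeoutError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the timeout
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")

# Usage with a stand-in upstream call that times out twice, then succeeds.
calls = {"n": 0}

def flaky_upstream() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("upstream timed out")
    return "created"

delays = []  # record the backoff delays instead of actually sleeping
print(call_with_retries(flaky_upstream, sleep=delays.append))  # created
print(delays)  # [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; bounding the number of attempts ensures a persistent upstream failure is still surfaced to the caller.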

Minor October 31, 2025

Cloud Agents in us-east experiencing update issues

Detected by Pingoru: Oct 31, 2025, 12:37 PM UTC
Resolved: Oct 31, 2025, 02:35 PM UTC
Duration: 1h 57m

Affected: US East - Cloud Agents

Timeline · 5 updates

investigating Oct 31, 2025, 12:37 PM UTC

We are currently investigating the issue.
monitoring Oct 31, 2025, 01:04 PM UTC

A fix has been implemented and we are monitoring the results.
identified Oct 31, 2025, 01:10 PM UTC

The Reconciler service experienced an interruption, preventing certain changes (e.g., update-secret, deployments, etc.) from being applied to agents during the incident. Creation of new agents was not affected, and only a subset of existing agents were impacted.
monitoring Oct 31, 2025, 01:10 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Oct 31, 2025, 02:35 PM UTC

This incident has been resolved.

Notice October 31, 2025

Frankfurt Cloud Provider Outage

Detected by Pingoru: Oct 31, 2025, 09:05 AM UTC
Resolved: Oct 31, 2025, 09:05 AM UTC
Duration: —

Affected: Europe Central - SIP, Europe Central - TURN, Europe Central - Egress, Europe Central - Ingress, Europe Central - Real Time Communication

Timeline · 1 update

resolved Oct 31, 2025, 09:05 AM UTC

On 2025-10-31 from 06:42 to 06:55 UTC, we detected a cloud provider outage in Frankfurt. We automatically routed around this. After recovery, we noticed elevated RTT to other European clusters, so as a precaution we kept Frankfurt drained until the network proved stable. It was undrained at 09:08.

Notice October 30, 2025

Saudi Arabia Cloud Provider Outage

Detected by Pingoru: Oct 30, 2025, 08:36 AM UTC
Resolved: Oct 30, 2025, 08:36 AM UTC
Duration: —

Affected: Saudi Arabia - Ingress, Saudi Arabia - SIP, Saudi Arabia - Egress, Saudi Arabia - TURN, Saudi Arabia - Real Time Communication, Saudi Arabia - Analytics Ingestion

Timeline · 1 update

resolved Oct 30, 2025, 08:36 AM UTC

Our cloud provider in Saudi Arabia went down from 07:30 to 08:20 UTC. We automatically route around these issues for unrestricted traffic, but customers with compliance requirements will have noticed a disruption of service.

Major October 28, 2025

Issues with SIP inbound calls

Detected by Pingoru: Oct 28, 2025, 06:26 PM UTC
Resolved: Oct 28, 2025, 08:15 PM UTC
Duration: 1h 48m

Affected: Global SIP

Timeline · 4 updates

Minor October 22, 2025

Analytics dashboard is down

Detected by Pingoru: Oct 22, 2025, 02:05 AM UTC
Resolved: Oct 22, 2025, 05:10 AM UTC
Duration: 3h 5m

Affected: Cloud Dashboard (cloud.livekit.io)

Timeline · 5 updates

investigating Oct 22, 2025, 02:05 AM UTC

We are currently investigating the issue. This affects only the analytics dashboard and API. Real-time communication is operational.
identified Oct 22, 2025, 02:10 AM UTC

The issue has been identified. The dashboard is now accessible. We are working on restarting the processing pipeline.
identified Oct 22, 2025, 02:10 AM UTC

We are continuing to work on a fix for this issue.
monitoring Oct 22, 2025, 04:49 AM UTC

A fix has been implemented. We are continuing to monitor the situation.
resolved Oct 22, 2025, 05:10 AM UTC

This incident has been resolved. We will backfill/restore the data.

Minor October 7, 2025

Issues with updating cloud agents

Detected by Pingoru: Oct 07, 2025, 06:06 PM UTC
Resolved: Oct 07, 2025, 09:25 PM UTC
Duration: 3h 19m

Affected: Global Cloud Agents

Timeline · 4 updates

investigating Oct 07, 2025, 06:06 PM UTC

We are investigating issues with updating existing cloud agents. Running agents are still working as expected.
identified Oct 07, 2025, 07:02 PM UTC

The issue has been identified and a fix is being implemented.
monitoring Oct 07, 2025, 08:29 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Oct 07, 2025, 09:25 PM UTC

This incident has been resolved.
