LiveKit Outage History

LiveKit is up right now

LiveKit had 66 outages in the last 2 years totaling 97h 31m of downtime — averaging 2.7 incidents per month.

There were 66 LiveKit outages since July 30, 2025 totaling 97h 31m of downtime. Each is summarised below — incident details, duration, and resolution information.
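
The durations listed for each incident can be recomputed from the detection and resolution timestamps. The following is a minimal Python sketch, not part of the status page: the function names are hypothetical, and the format string assumes every timestamp follows the `Feb 27, 2026, 12:18 AM UTC` layout used on this page.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout assumed from this page, e.g. "Feb 27, 2026, 12:18 AM UTC".
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_stamp(text: str) -> datetime:
    """Parse a page timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(text, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def format_duration(delta: timedelta) -> str:
    """Render a non-negative timedelta as days, hours and minutes.

    Zero-valued units are dropped; an all-zero duration renders as "0m".
    """
    total_minutes = int(delta.total_seconds() // 60)
    days, rem = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(rem, 60)
    parts = []
    if days:
        parts.append(f"{days}d")
    if hours:
        parts.append(f"{hours}h")
    if minutes or not parts:
        parts.append(f"{minutes}m")
    return " ".join(parts)


# Detection and resolution times of the February 27, 2026 incident.
detected = parse_stamp("Feb 27, 2026, 12:18 AM UTC")
resolved = parse_stamp("Feb 27, 2026, 02:40 AM UTC")
print(format_duration(resolved - detected))
```

For the February 27, 2026 incident this prints `2h 22m`, which matches the listed duration. A recomputed value can differ from the listed one by a minute, presumably because the page does not show seconds.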

Source: https://status.livekit.io

Minor March 14, 2026

Increased participant subscription failures

Detected by Pingoru
Mar 14, 2026, 12:56 AM UTC
Resolved
Mar 13, 2026, 08:00 PM UTC
Duration
Timeline · 1 update
  1. resolved Mar 14, 2026, 12:56 AM UTC

    Between 21:00 and 21:15 UTC, a limited number of participant subscriptions failed across regions, and subscribers in the Chicago region saw increased subscription latency.

Minor February 27, 2026

Analytics dashboard updates delayed

Detected by Pingoru
Feb 27, 2026, 12:18 AM UTC
Resolved
Feb 27, 2026, 02:40 AM UTC
Duration
2h 22m
Affected: Cloud Dashboard (cloud.livekit.io)
Timeline · 2 updates
  1. identified Feb 27, 2026, 12:18 AM UTC

    Updates to the LiveKit Cloud analytics dashboards are currently delayed due to an issue with our processing pipeline. Ingestion of new data is not affected, and we are in the process of recovering all data. We'll share another update once the restoration is complete and dashboards are updating normally.

  2. resolved Feb 27, 2026, 02:40 AM UTC

    Analytics dashboards are updating normally again.

Notice February 20, 2026

Reports of SIP INVITES not getting responses in Chicago region

Detected by Pingoru
Feb 20, 2026, 09:47 PM UTC
Resolved
Feb 21, 2026, 04:54 AM UTC
Duration
7h 7m
Affected: US Central - SIP
Timeline · 4 updates
  1. investigating Feb 20, 2026, 09:47 PM UTC

    We are currently investigating user reports of SIP invites not getting responses in our Chicago region. We have not yet determined if the issue is due to the users' trunking providers, but we have begun routing traffic away from Chicago to our other US clusters to ensure continued service.

  2. monitoring Feb 20, 2026, 10:41 PM UTC

    The users who made the original reports are no longer seeing issues. We will continue investigating and post the results of our investigation as soon as they are available, but there are currently no known issues in LiveKit's SIP infrastructure.

  3. resolved Feb 21, 2026, 04:54 AM UTC

    We are going to keep traffic routing to our other US clusters, but we will consider this incident closed while we continue to investigate its origin as there is currently no impact to users.

  4. postmortem Mar 06, 2026, 09:20 AM UTC

    **Root Cause**
    This issue had the same root cause as the [incident](https://status.livekit.io/incidents/xkcpnmycy5m3) that occurred after it in the US East region. Please see that link for the root cause, technical details, monitoring, and mitigations.

    **Timeline**
    2026-02-20 19:00 UTC – Isolated customer reports came in of no SIP responses to invites.
    2026-02-20 20:34 UTC – Looking at various customer examples, we determined that the common factor in problematic calls was the Chicago-based SIP load balancer.
    2026-02-20 21:26 UTC – SIP in Chicago was drained after ensuring that there was enough capacity in other regions.

Minor January 23, 2026

Elevated API latency affecting Real Time Communication services in London Region

Detected by Pingoru
Jan 23, 2026, 02:39 PM UTC
Resolved
Jan 23, 2026, 10:24 PM UTC
Duration
7h 45m
Affected: United Kingdom - Real Time Communication
Timeline · 6 updates
  1. investigating Jan 23, 2026, 02:39 PM UTC

    We are currently investigating reports of elevated API latency in the UK region.

  2. identified Jan 23, 2026, 02:50 PM UTC

    We are routing traffic away from the London cluster to mitigate impact.

  3. identified Jan 23, 2026, 04:21 PM UTC

    We are continuing to investigate the root cause of the performance degradation of the London cluster. We will continue to share updates as we learn more.

  4. monitoring Jan 23, 2026, 06:38 PM UTC

    All non-pinned traffic has been healthy since we began routing traffic through the nearest healthy nodes at 14:50 UTC. Our team is still working to bring the London cluster back to full health.

  5. resolved Jan 23, 2026, 10:24 PM UTC

    We brought the London cluster back to full health at 21:40 UTC. This incident is now resolved. We will publish a postmortem outlining the root cause and the steps we will take to ameliorate this issue in the future as soon as possible.

  6. postmortem Jan 27, 2026, 09:35 PM UTC

    Our cloud provider’s managed Kubernetes control plane experienced an issue that disrupted connectivity to cluster nodes, dropping some of the workloads in the cluster. Due to a bug in our auto-draining logic, detection of the issue was delayed. We have since fixed the alerting to ensure faster detection and response if a similar situation should occur in the future.

Minor January 22, 2026

Issues with a subset of inference requests to Gemini models

Detected by Pingoru
Jan 22, 2026, 06:29 PM UTC
Resolved
Jan 22, 2026, 07:46 PM UTC
Duration
1h 16m
Affected: Global Inference
Timeline · 3 updates
  1. identified Jan 22, 2026, 06:29 PM UTC

    We're investigating an issue with a subset of inference calls made to Gemini models.

  2. monitoring Jan 22, 2026, 06:34 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Jan 22, 2026, 07:46 PM UTC

    This incident has been resolved.

Minor January 7, 2026

US East RTC: Short-Lived Connectivity Interruption

Detected by Pingoru
Jan 07, 2026, 04:00 PM UTC
Resolved
Jan 07, 2026, 04:00 PM UTC
Duration
Timeline · 1 update
  1. resolved Jan 07, 2026, 08:40 PM UTC

    Between 16:01 UTC and 16:05 UTC, ~9-10% of participants connecting to the US East region experienced disconnection/reconnection to the RTC service. Client SDKs automatically reconnect, and the majority of affected users' access was restored by 16:05 UTC.

    Cause: A code change for a new feature was the root cause of the issue. The new feature is behind a flag that is disabled by default. However, the code changes made to support the new feature affected an existing path that did not check the feature flag, which triggered the feature's use by some mobile phone clients.

    Resolution: The deployment that caused the issue was rolled back. The erroneous code path is being fixed, and more tests will be added to ensure that it exhibits the expected behavior.
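
The report does not show the affected code. As a rough illustration of the general pattern it describes, the Python sketch below puts the flag check on the shared path so that no client type reaches the new behavior while the flag is off. Every name here (`Flags`, `new_feature_enabled`, `handle_join`, the client kinds) is hypothetical and not taken from LiveKit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flags:
    # Hypothetical flag; disabled by default, as the report describes.
    new_feature_enabled: bool = False


def handle_join(client_kind: str, flags: Flags) -> str:
    """Pick the code path for a joining client (all names are hypothetical)."""
    # The flag is checked on the shared path, so the new behavior is
    # unreachable for every client type while the flag is off.
    if flags.new_feature_enabled and client_kind == "mobile":
        return "new-path"
    return "existing-path"


# With default flags, a mobile client stays on the existing path.
print(handle_join("mobile", Flags()))
```

A test that asserts the default-flag result for each client kind is one way to catch a path that bypasses the flag before deployment.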

Major December 28, 2025

Connection errors from Chicago region

Detected by Pingoru
Dec 28, 2025, 12:29 AM UTC
Resolved
Dec 28, 2025, 12:37 AM UTC
Duration
8m
Affected: US Central - Real Time Communication
Timeline · 4 updates
  1. investigating Dec 28, 2025, 12:29 AM UTC

    We are investigating reports of connection failures originating from the Chicago region.

  2. identified Dec 28, 2025, 12:34 AM UTC

    The error is due to an expired certificate in the US Central cluster. We are routing traffic around the region.

  3. resolved Dec 28, 2025, 12:37 AM UTC

    We've resolved the root cause of the expiring cert. Service has been fully restored. This impacted core connectivity and API in the US Central (Chicago) region, including RTC, SIP, Phone numbers, Egress, and Ingress. A postmortem will follow shortly.

  4. postmortem Dec 28, 2025, 06:22 PM UTC

    The service we use to auto-renew certificates throttles aggressively. When a request to renew a cert is throttled or fails, we automatically retry the renewal. This process usually resolves and the cert is renewed. However, in some cases the renewal process can get stuck; we have alerting set up to catch these instances. Unfortunately, a logic error was introduced into this alert, so in this case a stuck renewal process did not alert and the certificate eventually expired. We have validated that no other certs are currently in this state. We have also resolved the issue with the alerting. In the future, we'll monitor changes made to this alerting process to ensure this does not happen again.
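
The postmortem does not describe how the alert is implemented. One common safeguard, sketched below in Python under stated assumptions, is an expiry check that looks only at the certificate's expiry time and ignores the renewal job's state, so a stuck or throttled renewal cannot silence it. The function name and the 7-day window are assumptions, not details from the report.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold; the postmortem does not state the real alerting window.
ALERT_WINDOW = timedelta(days=7)


def expiry_alert_needed(not_after: datetime, now: datetime,
                        window: timedelta = ALERT_WINDOW) -> bool:
    """Return True when the certificate expires within `window` or already has.

    Only the expiry time is inspected, never the renewal job's state.
    """
    return not_after - now <= window


now = datetime(2025, 12, 20, tzinfo=timezone.utc)
# A certificate expiring in five days falls inside the assumed window.
print(expiry_alert_needed(datetime(2025, 12, 25, tzinfo=timezone.utc), now))
```

In practice `not_after` would come from the certificate itself; how it is read depends on the tooling in use and is left out of this sketch.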

Minor December 11, 2025

Elevated setup times for Egress and SIP in Frankfurt region

Detected by Pingoru
Dec 11, 2025, 04:02 PM UTC
Resolved
Dec 12, 2025, 06:42 PM UTC
Duration
1d 2h
Affected: Europe Central - SIP, Europe Central - Egress
Timeline · 2 updates
  1. investigating Dec 12, 2025, 07:37 PM UTC

    We are currently investigating reports of increased API response times for Egress and SIP in our Frankfurt region.

  2. resolved Dec 12, 2025, 11:49 PM UTC

    This incident has been resolved. p99 latency between services in the cluster and the DB instance in the eu-central region was elevated from a single cluster. Traffic has been routed around the cluster.

Minor December 7, 2025

Analytics data is delayed

Detected by Pingoru
Dec 07, 2025, 08:07 PM UTC
Resolved
Dec 08, 2025, 03:10 PM UTC
Duration
19h 3m
Affected: Cloud Dashboard (cloud.livekit.io)
Timeline · 3 updates
  1. identified Dec 07, 2025, 08:07 PM UTC

    We are observing delays in the analytics data that's visible from the dashboard. We are working on processing the backlog of data. This impacts the Session detail page, Egresses, Ingresses, and SIP dashboards. Sessions listing and Agent insights are not affected.

  2. identified Dec 08, 2025, 08:29 AM UTC

    We've implemented fixes to help the data processor catch up. Live processing will be enabled soon.

  3. resolved Dec 08, 2025, 03:10 PM UTC

    A fix has been implemented. The processing is back online and will catch up to real time. We will backfill the data for the affected duration next.

Minor December 5, 2025

Cloud Dashboard Outage Due to Cloudflare Outage

Detected by Pingoru
Dec 05, 2025, 09:00 AM UTC
Resolved
Dec 05, 2025, 09:30 AM UTC
Duration
30m
Timeline · 1 update
  1. resolved Dec 05, 2025, 09:37 AM UTC

    User traffic was not impacted, but our customer dashboard was not operational because of a major Cloudflare outage that lasted roughly from 08:45 to 09:15 UTC. https://www.cloudflarestatus.com/incidents/k9ppxftx8bs5 has some details, though it appears the issue may have impacted Cloudflare caching as well. Docker Hub may have been impacted at the same time (https://www.dockerstatus.com/pages/incident/533c6539221ae15e3f000031/6932a03b2276fb05ff7ee506), but at this point we see no production impact from this on LiveKit.

Notice December 4, 2025

Network instability in some Asia regions

Detected by Pingoru
Dec 04, 2025, 05:46 AM UTC
Resolved
Dec 03, 2025, 03:00 PM UTC
Duration
Timeline · 1 update
  1. resolved Dec 04, 2025, 05:46 AM UTC

    On 2025-12-03 from 15:06 to 15:26 UTC, network connectivity was flapping in our India, Israel, Saudi Arabia and UAE regions. During that period, attempts to initiate RTC connections or API requests to those regions would have failed intermittently. Clients failing to connect would have been routed to a nearby available region on a subsequent attempt. Clients would have experienced a slightly higher join time because of that.
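
The fallback behavior described above, where a failed attempt is followed by a connection to a nearby available region, can be sketched as follows. This is an illustrative Python outline only: the function, the `try_connect` callback, and the region names are hypothetical and do not reflect how LiveKit's routing or client SDKs are implemented.

```python
from typing import Callable, Optional, Sequence


def connect_with_fallback(regions: Sequence[str],
                          try_connect: Callable[[str], bool]) -> Optional[str]:
    """Try regions in preference order and return the first that accepts.

    `regions` is assumed to be sorted nearest-first; `try_connect` is a
    hypothetical callback that returns True on a successful connection.
    """
    for region in regions:
        if try_connect(region):
            return region
    return None  # Every candidate region refused the connection.


# Hypothetical region names; the nearest one is unavailable in this example.
unavailable = {"nearest"}
chosen = connect_with_fallback(
    ["nearest", "nearby-1", "nearby-2"],
    lambda region: region not in unavailable,
)
print(chosen)
```

Each failed attempt adds its own timeout before the next region is tried, which is consistent with the slightly higher join time noted in the update.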

Minor November 18, 2025

Web page outage

Detected by Pingoru
Nov 18, 2025, 07:42 PM UTC
Resolved
Nov 18, 2025, 07:42 PM UTC
Duration
Timeline · 1 update
  1. resolved Nov 18, 2025, 07:42 PM UTC

    At 11:48 UTC a Cloudflare outage affected the ability to access LiveKit web pages, including www.livekit.io and our Cloud Dashboard (cloud.livekit.io). All other systems remained functional and were unaffected. End user services, including RTC, Ingress, Egress, Hosted Agents, and Inference Gateway remained functional throughout the outage. The issue was resolved by Cloudflare by 17:44 UTC.

Minor November 6, 2025

EU Central - Cloud Agents creation issues

Detected by Pingoru
Nov 06, 2025, 03:39 PM UTC
Resolved
Nov 06, 2025, 10:03 PM UTC
Duration
6h 24m
Affected: Europe Central - Cloud Agents
Timeline · 4 updates
  1. investigating Nov 06, 2025, 03:39 PM UTC

    We are investigating intermittent issues with creating new cloud agents in EU Central. Running agents in this region are not affected.

  2. identified Nov 06, 2025, 04:38 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Nov 06, 2025, 07:28 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Nov 06, 2025, 10:03 PM UTC

    This incident has been resolved. An upstream service provider was having intermittent timeout issues. We've added retries to this interaction and have not seen a recurrence.
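
The update does not say how the retries were added. A minimal Python sketch of one common approach, retrying on timeout with exponential backoff, is shown below. The function name, attempt count, and delays are assumptions for illustration, not LiveKit's actual settings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T], attempts: int = 3,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `operation`, retrying on TimeoutError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the timeout to the caller.
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so that the backoff schedule can be checked in tests without waiting. Retries of this kind are only safe when the upstream call is idempotent.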

Minor October 31, 2025

Cloud Agents in us-east experiencing update issues

Detected by Pingoru
Oct 31, 2025, 12:37 PM UTC
Resolved
Oct 31, 2025, 02:35 PM UTC
Duration
1h 57m
Affected: US East - Cloud Agents
Timeline · 5 updates
  1. investigating Oct 31, 2025, 12:37 PM UTC

    We are currently investigating the issue.

  2. monitoring Oct 31, 2025, 01:04 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. identified Oct 31, 2025, 01:10 PM UTC

    The Reconciler service experienced an interruption, preventing certain changes (e.g., update-secret, deployments, etc.) from being applied to agents during the incident. Creation of new agents was not affected, and only a subset of existing agents were impacted.

  4. monitoring Oct 31, 2025, 01:10 PM UTC

    A fix has been implemented and we are monitoring the results.

  5. resolved Oct 31, 2025, 02:35 PM UTC

    This incident has been resolved.

Notice October 31, 2025

Frankfurt Cloud Provider Outage

Detected by Pingoru
Oct 31, 2025, 09:05 AM UTC
Resolved
Oct 31, 2025, 09:05 AM UTC
Duration
Affected: Europe Central - SIP, Europe Central - TURN, Europe Central - Egress, Europe Central - Ingress, Europe Central - Real Time Communication
Timeline · 1 update
  1. resolved Oct 31, 2025, 09:05 AM UTC

    On 2025-10-31 from 06:42 to 06:55 UTC, we detected a cloud provider outage in Frankfurt. We automatically routed around this. After recovery, we noticed elevated RTT to other European clusters, so as a precaution we kept Frankfurt drained until the network proved stable. It was undrained at 09:08.

Notice October 30, 2025

Saudi Arabia Cloud Provider Outage

Detected by Pingoru
Oct 30, 2025, 08:36 AM UTC
Resolved
Oct 30, 2025, 08:36 AM UTC
Duration
Affected: Saudi Arabia - Ingress, Saudi Arabia - SIP, Saudi Arabia - Egress, Saudi Arabia - TURN, Saudi Arabia - Real Time Communication, Saudi Arabia - Analytics Ingestion
Timeline · 1 update
  1. resolved Oct 30, 2025, 08:36 AM UTC

    Our Cloud Provider in Saudi Arabia went down from 7:30 - 8:20 UTC. We automatically route around these issues for unrestricted traffic, but customers with compliance requirements will have noticed a disruption of service.

Minor October 22, 2025

Analytics dashboard is down

Detected by Pingoru
Oct 22, 2025, 02:05 AM UTC
Resolved
Oct 22, 2025, 05:10 AM UTC
Duration
3h 5m
Affected: Cloud Dashboard (cloud.livekit.io)
Timeline · 5 updates
  1. investigating Oct 22, 2025, 02:05 AM UTC

    We are currently investigating the issue. This affects only the analytics dashboard and API. Real-time communications are operational.

  2. identified Oct 22, 2025, 02:10 AM UTC

    The issue has been identified. The dashboard is now accessible. We are working on restarting the processing pipeline.

  3. identified Oct 22, 2025, 02:10 AM UTC

    We are continuing to work on a fix for this issue.

  4. monitoring Oct 22, 2025, 04:49 AM UTC

    A fix has been implemented. We are continuing to monitor the situation.

  5. resolved Oct 22, 2025, 05:10 AM UTC

    This incident has been resolved. We will backfill/restore the data.

Minor October 7, 2025

Issues with updating cloud agents

Detected by Pingoru
Oct 07, 2025, 06:06 PM UTC
Resolved
Oct 07, 2025, 09:25 PM UTC
Duration
3h 19m
Affected: Global Cloud Agents
Timeline · 4 updates
  1. investigating Oct 07, 2025, 06:06 PM UTC

    Issues with updating existing cloud agents. Running agents are still working as expected.

  2. identified Oct 07, 2025, 07:02 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Oct 07, 2025, 08:29 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Oct 07, 2025, 09:25 PM UTC

    This incident has been resolved.
