- Detected by Pingoru
- Mar 14, 2026, 12:56 AM UTC
- Resolved
- Mar 13, 2026, 08:00 PM UTC
- Duration
- —
Timeline · 1 update
-
resolved Mar 14, 2026, 12:56 AM UTC
Between 21:00 and 21:15 UTC, a limited number of participant subscriptions failed across regions, and subscribers in the Chicago region experienced increased subscription latency.
- Detected by Pingoru
- Mar 02, 2026, 10:30 AM UTC
- Resolved
- Mar 02, 2026, 10:30 AM UTC
- Duration
- —
Timeline · 2 updates
- Detected by Pingoru
- Feb 27, 2026, 12:18 AM UTC
- Resolved
- Feb 27, 2026, 02:40 AM UTC
- Duration
- 2h 22m
Affected: Cloud Dashboard (cloud.livekit.io)
Timeline · 2 updates
-
identified Feb 27, 2026, 12:18 AM UTC
Updates to the LiveKit Cloud analytics dashboards are currently delayed due to an issue with our processing pipeline. Ingestion of new data is not affected, and we are in the process of recovering all data. We'll share another update once the restoration is complete and dashboards are updating normally.
-
resolved Feb 27, 2026, 02:40 AM UTC
Analytics dashboards are updating normally again.
- Detected by Pingoru
- Feb 20, 2026, 09:47 PM UTC
- Resolved
- Feb 21, 2026, 04:54 AM UTC
- Duration
- 7h 7m
Affected: US Central - SIP
Timeline · 4 updates
-
investigating Feb 20, 2026, 09:47 PM UTC
We are currently investigating user reports of SIP invites not getting responses in our Chicago region. We have not yet determined if the issue is due to the users' trunking providers, but we have begun routing traffic away from Chicago to our other US clusters to ensure continued service.
-
monitoring Feb 20, 2026, 10:41 PM UTC
The users who made the original reports are no longer seeing issues. We will continue investigating and post the results of our investigation as soon as they are available, but there are currently no known issues in LiveKit's SIP infrastructure.
-
resolved Feb 21, 2026, 04:54 AM UTC
We are going to keep traffic routing to our other US clusters, but we will consider this incident closed while we continue to investigate its origin as there is currently no impact to users.
-
postmortem Mar 06, 2026, 09:20 AM UTC
**Root Cause**
This issue had the same root cause as the [incident](https://status.livekit.io/incidents/xkcpnmycy5m3) that occurred after it in the US East region. Please see that report for the root cause, technical details, monitoring, and mitigations.
**Timeline**
2026-02-20 19:00 UTC – Isolated customer reports came in of SIP invites receiving no responses.
2026-02-20 20:34 UTC – After looking at various customer examples, we determined that the common factor in problematic calls was the Chicago-based SIP load balancer.
2026-02-20 21:26 UTC – SIP in Chicago was drained after ensuring there was enough capacity in other regions.
- Detected by Pingoru
- Feb 20, 2026, 09:12 AM UTC
- Resolved
- Feb 14, 2026, 04:00 PM UTC
- Duration
- —
Timeline · 2 updates
- Detected by Pingoru
- Jan 28, 2026, 04:45 PM UTC
- Resolved
- Jan 28, 2026, 09:45 PM UTC
- Duration
- 4h 59m
Affected: Europe Central - SIP, Europe South - SIP, United Kingdom - SIP
Timeline · 8 updates
- Detected by Pingoru
- Jan 23, 2026, 02:39 PM UTC
- Resolved
- Jan 23, 2026, 10:24 PM UTC
- Duration
- 7h 45m
Affected: United Kingdom - Real Time Communication
Timeline · 6 updates
-
investigating Jan 23, 2026, 02:39 PM UTC
We are currently investigating reports of elevated API latency in the UK region.
-
identified Jan 23, 2026, 02:50 PM UTC
We are routing traffic away from the London cluster to mitigate impact.
-
identified Jan 23, 2026, 04:21 PM UTC
We are continuing to investigate the root cause of the performance degradation of the London cluster. We will continue to share updates as we learn more.
-
monitoring Jan 23, 2026, 06:38 PM UTC
All non-pinned traffic has been healthy since we began routing traffic through the nearest healthy nodes at 14:50 UTC. Our team is still working to bring the London cluster back to full health.
-
resolved Jan 23, 2026, 10:24 PM UTC
We brought the London cluster back to full health at 21:40 UTC. This incident is now resolved. We will publish a postmortem outlining the root cause and the steps we will take to ameliorate this issue in the future as soon as possible.
-
postmortem Jan 27, 2026, 09:35 PM UTC
Our cloud provider’s managed Kubernetes control plane experienced an issue that disrupted connectivity to cluster nodes, dropping some of the workloads in the cluster. Due to a bug in our auto-draining logic, detection of the issue was delayed. We have since fixed the alerting to ensure faster detection and response if a similar situation should occur in the future.
- Detected by Pingoru
- Jan 22, 2026, 06:29 PM UTC
- Resolved
- Jan 22, 2026, 07:46 PM UTC
- Duration
- 1h 16m
Affected: Global Inference
Timeline · 3 updates
-
identified Jan 22, 2026, 06:29 PM UTC
We're investigating an issue with a subset of inference calls made to Gemini models.
-
monitoring Jan 22, 2026, 06:34 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Jan 22, 2026, 07:46 PM UTC
This incident has been resolved.
- Detected by Pingoru
- Jan 09, 2026, 10:00 AM UTC
- Resolved
- Jan 09, 2026, 10:24 AM UTC
- Duration
- 24m
Affected: US West - SIP, US East - SIP, US Central - SIP
Timeline · 5 updates
- Detected by Pingoru
- Jan 07, 2026, 04:00 PM UTC
- Resolved
- Jan 07, 2026, 04:00 PM UTC
- Duration
- —
Timeline · 1 update
-
resolved Jan 07, 2026, 08:40 PM UTC
Between 16:01 UTC and 16:05 UTC, ~9-10% of participants connecting to the US East region experienced disconnection/reconnection to the RTC service. Client SDKs automatically reconnect, and access was restored for the majority of affected users by 16:05 UTC.
Cause: A code change for a new feature was the root cause of the issue. The new feature is behind a flag that is disabled by default, but the code changes made to support it affected an existing path that did not check the feature flag, which triggered use of the feature by some mobile phone clients.
Resolution: The deployment that caused the issue was rolled back. The erroneous code path is being fixed, and more tests will be added to ensure that it behaves as expected.
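To illustrate the class of bug described in this update, here is a minimal sketch of routing every entry point through a single flag check, so that an existing path cannot reach new behavior while the flag is off. The names `use_new_feature`, `handle_join`, and the `"new_feature"` flag key are hypothetical and are not taken from LiveKit's code.

```python
def use_new_feature(client_requests_feature: bool, flags: dict) -> bool:
    """Single decision point: the flag (hypothetical key) defaults to off."""
    return bool(flags.get("new_feature", False)) and client_requests_feature


def handle_join(client_requests_feature: bool, flags: dict) -> str:
    """Existing path (hypothetical): it consults the same helper, never the client alone."""
    if use_new_feature(client_requests_feature, flags):
        return "new_path"
    return "existing_path"
```

With the flag absent or disabled, a client that asks for the feature still lands on the existing path, which is the property a regression test for this kind of change would assert.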
- Detected by Pingoru
- Dec 31, 2025, 07:24 PM UTC
- Resolved
- Jan 01, 2026, 04:02 AM UTC
- Duration
- 8h 37m
Affected: Global SIP
Timeline · 4 updates
- Detected by Pingoru
- Dec 28, 2025, 12:29 AM UTC
- Resolved
- Dec 28, 2025, 12:37 AM UTC
- Duration
- 8m
Affected: US Central - Real Time Communication
Timeline · 4 updates
-
investigating Dec 28, 2025, 12:29 AM UTC
We are investigating reports of connection failures originating from the Chicago region.
-
identified Dec 28, 2025, 12:34 AM UTC
The error is due to an expired certificate in the US Central cluster. We are routing traffic around the region.
-
resolved Dec 28, 2025, 12:37 AM UTC
We've resolved the root cause of the expiring cert, and service has been fully restored. This impacted core connectivity and API in the US Central (Chicago) region, including RTC, SIP, Phone numbers, Egress, and Ingress. A postmortem will follow shortly.
-
postmortem Dec 28, 2025, 06:22 PM UTC
The service we use to auto-renew certificates throttles aggressively. When a request to renew a cert is throttled or fails, we automatically retry the renewal. This process usually succeeds and the cert is renewed. However, in some cases the renewal process can get stuck; we have alerting set up to catch these instances. Unfortunately, a logic error was introduced into this alert, so in this case a stuck renewal process did not alert and the certificate eventually expired. We have validated that no other certs are currently in this state, and we have resolved the issue with the alerting. Going forward, we will monitor changes made to this alerting process to ensure this does not happen again.
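One common safeguard for this failure mode is an expiry check that looks only at how much validity a certificate has left, independent of what the renewal process reports. The sketch below is illustrative and not LiveKit's alerting; `CertStatus`, `certs_needing_alert`, and the 14-day threshold are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CertStatus:
    name: str                  # hypothetical certificate identifier
    not_after: datetime        # expiry time, timezone-aware (UTC)
    renewal_in_progress: bool  # what the renewal process claims


def certs_needing_alert(certs, now, warn_before=timedelta(days=14)):
    """Return the names of certs that expire within warn_before of now.

    renewal_in_progress is deliberately ignored: a renewal that is stuck
    retrying still leaves the certificate close to expiry.
    """
    return [c.name for c in certs if c.not_after - now <= warn_before]


now = datetime(2025, 12, 20, tzinfo=timezone.utc)
certs = [
    CertStatus("stuck-renewal", datetime(2025, 12, 28, tzinfo=timezone.utc), True),
    CertStatus("healthy", datetime(2026, 3, 1, tzinfo=timezone.utc), False),
]
alerts = certs_needing_alert(certs, now)
```

Because the check depends only on the expiry date, a logic error in the stuck-renewal alert would not silence it.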
- Detected by Pingoru
- Dec 11, 2025, 04:02 PM UTC
- Resolved
- Dec 12, 2025, 06:42 PM UTC
- Duration
- 1d 2h
Affected: Europe Central - SIP, Europe Central - Egress
Timeline · 2 updates
-
investigating Dec 12, 2025, 07:37 PM UTC
We are currently investigating reports of increased API response times for Egress and SIP in our Frankfurt region.
-
resolved Dec 12, 2025, 11:49 PM UTC
This incident has been resolved. p99 latency between services in a single cluster and the DB instance in the eu-central region was elevated. Traffic has been routed around that cluster.
- Detected by Pingoru
- Dec 07, 2025, 08:07 PM UTC
- Resolved
- Dec 08, 2025, 03:10 PM UTC
- Duration
- 19h 3m
Affected: Cloud Dashboard (cloud.livekit.io)
Timeline · 3 updates
-
identified Dec 07, 2025, 08:07 PM UTC
We are observing delays in the analytics data that's visible from the dashboard, and we are working on processing the backlog of data. This impacts the Session detail page and the Egresses, Ingresses, and SIP dashboards. Sessions listing and Agent insights are not affected.
-
identified Dec 08, 2025, 08:29 AM UTC
We've implemented fixes to help the data processor catch up. Live processing will be enabled soon.
-
resolved Dec 08, 2025, 03:10 PM UTC
A fix has been implemented. Processing is back online and will catch up to real time. We will backfill the data for the affected period next.
- Detected by Pingoru
- Dec 05, 2025, 09:00 AM UTC
- Resolved
- Dec 05, 2025, 09:30 AM UTC
- Duration
- 30m
Timeline · 1 update
-
resolved Dec 05, 2025, 09:37 AM UTC
User traffic was not impacted, but our customer dashboard was not operational because of a major Cloudflare outage that lasted from roughly 08:45 to 09:15 UTC. https://www.cloudflarestatus.com/incidents/k9ppxftx8bs5 has some details, though it appears the issue may have impacted Cloudflare caching as well. Docker Hub may have been impacted at the same time (https://www.dockerstatus.com/pages/incident/533c6539221ae15e3f000031/6932a03b2276fb05ff7ee506), but at this point we see no production impact on LiveKit.
- Detected by Pingoru
- Dec 04, 2025, 05:46 AM UTC
- Resolved
- Dec 03, 2025, 03:00 PM UTC
- Duration
- —
Timeline · 1 update
-
resolved Dec 04, 2025, 05:46 AM UTC
On 2025-12-03 from 15:06 to 15:26 UTC, network connectivity was flapping in our India, Israel, Saudi Arabia and UAE regions. During that period, attempts to initiate RTC connections or API requests to those regions would have failed intermittently. Clients failing to connect would have been routed to a nearby available region on a subsequent attempt. Clients would have experienced a slightly higher join time because of that.
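The fallback behavior described in this update can be sketched as trying regions in order of proximity until one accepts the connection, which also shows why join time rises slightly when the nearest region is unreachable. This is an illustration only; `connect_with_fallback`, `try_connect`, and the region names are hypothetical, and the actual routing logic is not described in this report.

```python
def connect_with_fallback(regions_by_proximity, try_connect):
    """Try each region in order; return (region, attempts) for the first success.

    try_connect is a caller-supplied function (hypothetical) that returns
    True when a connection to the given region succeeds.
    """
    for attempts, region in enumerate(regions_by_proximity, start=1):
        if try_connect(region):
            return region, attempts
    raise ConnectionError("no region reachable")


# Hypothetical example: the nearest region is flapping, the next one is healthy.
unreachable = {"region-a"}
result = connect_with_fallback(
    ["region-a", "region-b", "region-c"],
    lambda region: region not in unreachable,
)
```

Each failed attempt adds one connection round to the join, which matches the slightly higher join time noted above.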
- Detected by Pingoru
- Nov 18, 2025, 07:42 PM UTC
- Resolved
- Nov 18, 2025, 07:42 PM UTC
- Duration
- —
Timeline · 1 update
-
resolved Nov 18, 2025, 07:42 PM UTC
At 11:48 UTC a Cloudflare outage affected the ability to access LiveKit web pages, including www.livekit.io and our Cloud Dashboard (cloud.livekit.io). All other systems remained functional and were unaffected. End user services, including RTC, Ingress, Egress, Hosted Agents, and Inference Gateway remained functional throughout the outage. The issue was resolved by Cloudflare by 17:44 UTC.
- Detected by Pingoru
- Nov 10, 2025, 04:29 PM UTC
- Resolved
- Nov 10, 2025, 06:30 PM UTC
- Duration
- 2h 1m
Affected: US West - Egress, US East - Egress, US Central - Egress
Timeline · 3 updates
- Detected by Pingoru
- Nov 06, 2025, 03:39 PM UTC
- Resolved
- Nov 06, 2025, 10:03 PM UTC
- Duration
- 6h 24m
Affected: Europe Central - Cloud Agents
Timeline · 4 updates
-
investigating Nov 06, 2025, 03:39 PM UTC
We are investigating intermittent issues with creating new cloud agents in EU Central. Running agents in this region are not affected.
-
identified Nov 06, 2025, 04:38 PM UTC
The issue has been identified and a fix is being implemented.
-
monitoring Nov 06, 2025, 07:28 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Nov 06, 2025, 10:03 PM UTC
This incident has been resolved. An upstream service provider was having intermittent timeout issues. We've added retries to this interaction and have not seen a recurrence.
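Retrying a call that times out intermittently is often done with exponential backoff. The following is a minimal sketch of that pattern, not the fix that was deployed; `call_with_retries`, the attempt count, and the delays are assumptions chosen for illustration.

```python
import time


def call_with_retries(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying on TimeoutError with exponential backoff.

    The final failure is re-raised so callers still see persistent outages.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except TimeoutError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


# Hypothetical upstream call that times out twice, then succeeds.
calls = {"count": 0}

def flaky_upstream():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("upstream timed out")
    return "created"

delays = []
outcome = call_with_retries(flaky_upstream, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the example fast to run and makes the backoff schedule easy to inspect.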
- Detected by Pingoru
- Oct 31, 2025, 12:37 PM UTC
- Resolved
- Oct 31, 2025, 02:35 PM UTC
- Duration
- 1h 57m
Affected: US East - Cloud Agents
Timeline · 5 updates
-
investigating Oct 31, 2025, 12:37 PM UTC
We are currently investigating the issue.
-
monitoring Oct 31, 2025, 01:04 PM UTC
A fix has been implemented and we are monitoring the results.
-
identified Oct 31, 2025, 01:10 PM UTC
The Reconciler service experienced an interruption, preventing certain changes (e.g., update-secret, deployments, etc.) from being applied to agents during the incident. Creation of new agents was not affected, and only a subset of existing agents were impacted.
-
monitoring Oct 31, 2025, 01:10 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Oct 31, 2025, 02:35 PM UTC
This incident has been resolved.
- Detected by Pingoru
- Oct 31, 2025, 09:05 AM UTC
- Resolved
- Oct 31, 2025, 09:05 AM UTC
- Duration
- —
Affected: Europe Central - SIP, Europe Central - TURN, Europe Central - Egress, Europe Central - Ingress, Europe Central - Real Time Communication
Timeline · 1 update
-
resolved Oct 31, 2025, 09:05 AM UTC
On 2025-10-31 from 06:42 to 06:55 UTC, we detected a cloud provider outage in Frankfurt. We automatically routed around this. After recovery, we noticed elevated RTT to other European clusters, so as a precaution we kept Frankfurt drained until the network proved stable. It was undrained at 09:08.
- Detected by Pingoru
- Oct 30, 2025, 08:36 AM UTC
- Resolved
- Oct 30, 2025, 08:36 AM UTC
- Duration
- —
Affected: Saudi Arabia - Ingress, Saudi Arabia - SIP, Saudi Arabia - Egress, Saudi Arabia - TURN, Saudi Arabia - Real Time Communication, Saudi Arabia - Analytics Ingestion
Timeline · 1 update
-
resolved Oct 30, 2025, 08:36 AM UTC
Our cloud provider in Saudi Arabia went down from 07:30 to 08:20 UTC. We automatically route around these issues for unrestricted traffic, but customers with compliance requirements will have noticed a disruption of service.
- Detected by Pingoru
- Oct 28, 2025, 06:26 PM UTC
- Resolved
- Oct 28, 2025, 08:15 PM UTC
- Duration
- 1h 48m
Affected: Global SIP
Timeline · 4 updates
- Detected by Pingoru
- Oct 22, 2025, 02:05 AM UTC
- Resolved
- Oct 22, 2025, 05:10 AM UTC
- Duration
- 3h 5m
Affected: Cloud Dashboard (cloud.livekit.io)
Timeline · 5 updates
-
investigating Oct 22, 2025, 02:05 AM UTC
We are currently investigating the issue. This affects only the analytics dashboard and API. Real-time communication is operational.
-
identified Oct 22, 2025, 02:10 AM UTC
The issue has been identified. The dashboard is now accessible. We are working on restarting the processing pipeline.
-
identified Oct 22, 2025, 02:10 AM UTC
We are continuing to work on a fix for this issue.
-
monitoring Oct 22, 2025, 04:49 AM UTC
A fix has been implemented. We are continuing to monitor the situation.
-
resolved Oct 22, 2025, 05:10 AM UTC
This incident has been resolved. We will backfill/restore the data.
- Detected by Pingoru
- Oct 07, 2025, 06:06 PM UTC
- Resolved
- Oct 07, 2025, 09:25 PM UTC
- Duration
- 3h 19m
Affected: Global Cloud Agents
Timeline · 4 updates
-
investigating Oct 07, 2025, 06:06 PM UTC
We are investigating issues with updating existing cloud agents. Running agents are still working as expected.
-
identified Oct 07, 2025, 07:02 PM UTC
The issue has been identified and a fix is being implemented.
-
monitoring Oct 07, 2025, 08:29 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Oct 07, 2025, 09:25 PM UTC
This incident has been resolved.