LiveKit Outage History

LiveKit is up right now

LiveKit had 66 outages in the last 2 years totaling 31h 54m of downtime — averaging 2.7 incidents per month.

There have been 66 LiveKit outages since July 30, 2025, totaling 31h 54m of downtime. Each is summarised below with incident details, duration, and resolution information.

Source: https://status.livekit.io

Minor October 6, 2025

Issues with building new releases on cloud agents

Detected by Pingoru
Oct 06, 2025, 10:33 PM UTC
Resolved
Oct 06, 2025, 11:20 PM UTC
Duration
46m
Affected: Global Cloud Agents
Timeline · 4 updates
  1. investigating Oct 06, 2025, 10:33 PM UTC

    We are currently investigating the issue. Running cloud agents are not affected, only new builds.

  2. monitoring Oct 06, 2025, 10:46 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Oct 06, 2025, 11:20 PM UTC

    This incident has been resolved.

  4. postmortem Oct 06, 2025, 11:20 PM UTC

    During a routine cloud-agents control plane release, a configuration error caused the service to crash. We quickly identified and corrected the problematic config value, restoring normal operation. We are investigating ways to make our configs more robust and plan to add further testing around this process to ensure the issue doesn’t happen again.

Read the full incident report →

Major October 2, 2025

Networking issues in Tokyo region

Detected by Pingoru
Oct 02, 2025, 10:41 AM UTC
Resolved
Oct 02, 2025, 11:05 AM UTC
Duration
24m
Affected: Global Real Time Communication
Timeline · 4 updates
  1. investigating Oct 02, 2025, 10:41 AM UTC

    We've discovered networking issues at our cloud provider in Tokyo. We're routing around them.

  2. investigating Oct 02, 2025, 10:44 AM UTC

    We are continuing to investigate this issue.

  3. monitoring Oct 02, 2025, 10:57 AM UTC

    We have mitigated by routing traffic around the affected region and continue to monitor.

  4. resolved Oct 02, 2025, 11:05 AM UTC

    This incident has been resolved.

Read the full incident report →

Minor September 25, 2025

Analytics data processing pipeline is currently offline

Detected by Pingoru
Sep 25, 2025, 11:12 AM UTC
Resolved
Sep 25, 2025, 07:40 PM UTC
Duration
8h 27m
Affected: Cloud Dashboard (cloud.livekit.io)
Timeline · 3 updates
  1. identified Sep 25, 2025, 11:12 AM UTC

    The analytics data processing pipeline is currently down. We are working on recovering it. The Cloud dashboard and analytics API will be missing data during this time.

  2. monitoring Sep 25, 2025, 03:00 PM UTC

    A fix has been deployed and we are monitoring.

  3. resolved Sep 25, 2025, 07:40 PM UTC

    This incident has been resolved. We will work on recovering the data for the period during which processing was down.

Read the full incident report →

Minor September 25, 2025

Dashboard data is delayed

Detected by Pingoru
Sep 25, 2025, 07:00 AM UTC
Resolved
Sep 24, 2025, 07:00 AM UTC
Duration
Timeline · 1 update
  1. resolved Sep 25, 2025, 04:40 AM UTC

    We've discovered a data pipeline issue causing Cloud dashboard data to be delayed. We are investigating the issue and will be backfilling the data.

Read the full incident report →
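The entry above lists a resolved time earlier than its detected time and shows no duration. As a rough illustration of how such entries could be parsed and flagged, here is a minimal sketch. The timestamp format is inferred from the values shown on this page, the function names `parse_utc` and `incident_duration` are hypothetical, and whether the page itself computes durations this way is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Format inferred from timestamps on this page, e.g. "Oct 02, 2025, 10:41 AM UTC".
TIMESTAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_utc(text: str) -> datetime:
    """Parse a page timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(text, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def incident_duration(detected: str, resolved: str) -> Optional[timedelta]:
    """Return resolved - detected, or None when resolved precedes detected."""
    start, end = parse_utc(detected), parse_utc(resolved)
    if end < start:
        # Inconsistent entry: no meaningful duration can be reported.
        return None
    return end - start


# Tokyo networking incident (Oct 2, 2025): listed duration is 24m.
print(incident_duration("Oct 02, 2025, 10:41 AM UTC", "Oct 02, 2025, 11:05 AM UTC"))  # 0:24:00
# Delayed dashboard data (Sep 25, 2025): resolved is listed before detected.
print(incident_duration("Sep 25, 2025, 07:00 AM UTC", "Sep 24, 2025, 07:00 AM UTC"))  # None
```

Returning `None` instead of a negative duration keeps inconsistent entries visibly distinct from short incidents.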

Minor September 25, 2025

Limited availability of egress service due to Docker Hub outage

Detected by Pingoru
Sep 25, 2025, 12:41 AM UTC
Resolved
Sep 25, 2025, 01:02 AM UTC
Duration
20m
Affected: Global Egress
Timeline · 3 updates
  1. investigating Sep 25, 2025, 12:41 AM UTC

    An outage at our Docker image registry provider (Docker Hub) is preventing us from scaling up the egress service, causing limited availability. We are working on potential mitigations and monitoring the status of Docker Hub.

  2. identified Sep 25, 2025, 12:45 AM UTC

    An outage at our Docker image registry provider (Docker Hub) is preventing us from scaling up the egress service, causing limited availability. We are working on potential mitigations and monitoring the status of Docker Hub.

  3. resolved Sep 25, 2025, 01:02 AM UTC

    The Docker Hub outage is resolved, allowing us to scale up as expected again.

Read the full incident report →

Minor September 22, 2025

SIP call outage in EU

Detected by Pingoru
Sep 22, 2025, 08:13 AM UTC
Resolved
Sep 22, 2025, 11:30 AM UTC
Duration
3h 17m
Affected: Global SIP
Timeline · 6 updates
  1. investigating Sep 22, 2025, 08:13 AM UTC

    We are currently investigating reports of SIP calls into the EU region not working.

  2. identified Sep 22, 2025, 09:53 AM UTC

    The issue has been identified and a fix is being implemented.

  3. identified Sep 22, 2025, 10:30 AM UTC

    The fix has been deployed to all regions.

  4. monitoring Sep 22, 2025, 11:07 AM UTC

    A fix has been implemented and we are monitoring the results.

  5. resolved Sep 22, 2025, 11:30 AM UTC

    The incident is resolved.

  6. postmortem Sep 23, 2025, 04:35 AM UTC

    On Friday, Sep 19, 2025, at 11:21 PM UTC, we rolled out a change to our SIP gateway servers that reduced the allowed MTU size for incoming packets to 1500 bytes. While this did not affect all calls, 5% of calls where INVITE messages exceeded the SIP-mandated MTU size of 1500 bytes were dropped. These drops may have looked like network issues to affected users. The fix was to increase the MTU size limit on the SIP gateway, which resolved the issue. To prevent similar incidents in the future, we have already extended both our monitoring and our release verification suite to cover this case.

Read the full incident report →
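The postmortem above attributes the dropped calls to INVITE messages larger than the gateway's 1500-byte limit. As a rough illustration, a verification check along the following lines could flag oversized messages. This is a sketch under stated assumptions, not LiveKit's verification suite: the function names `exceeds_limit` and `build_invite`, the sample INVITE, and the `X-Padding` header are all hypothetical, and only the 1500-byte figure comes from the postmortem.

```python
# Hypothetical size check for SIP messages.
# Only the 1500-byte figure comes from the postmortem above.
SIZE_LIMIT_BYTES = 1500


def exceeds_limit(sip_message: str, limit: int = SIZE_LIMIT_BYTES) -> bool:
    """Return True if the UTF-8 encoded message is larger than `limit` bytes."""
    return len(sip_message.encode("utf-8")) > limit


def build_invite(extra_header_bytes: int = 0) -> str:
    """Build an illustrative INVITE; every address and tag is a placeholder."""
    lines = [
        "INVITE sip:callee@example.com SIP/2.0",
        "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK776asdhds",
        "Max-Forwards: 70",
        "To: <sip:callee@example.com>",
        "From: <sip:caller@example.com>;tag=1928301774",
        "Call-ID: a84b4c76e66710@client.example.com",
        "CSeq: 314159 INVITE",
        "Content-Length: 0",
    ]
    if extra_header_bytes:
        # Pad with a made-up header to simulate a large INVITE.
        lines.insert(-1, "X-Padding: " + "a" * extra_header_bytes)
    return "\r\n".join(lines) + "\r\n\r\n"


small = build_invite()
large = build_invite(extra_header_bytes=2000)
print(exceeds_limit(small), exceeds_limit(large))
```

This measures only the SIP payload. IP and UDP headers add to the on-wire packet size, so a real check would need a lower payload threshold than the packet limit.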

Minor September 12, 2025

Cloud agent storage and deploy issue

Detected by Pingoru
Sep 12, 2025, 04:54 AM UTC
Resolved
Sep 12, 2025, 07:26 AM UTC
Duration
2h 32m
Affected: Global Cloud Agents
Timeline · 3 updates
  1. identified Sep 12, 2025, 04:54 AM UTC

    We've identified an issue with cloud agent build and deployments. Fix is in progress. Note the issue only impacts new builds. Existing agents continue to work as expected.

  2. monitoring Sep 12, 2025, 06:43 AM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Sep 12, 2025, 07:26 AM UTC

    This incident has been resolved.

Read the full incident report →

Notice September 10, 2025

2.5% of Inbound UDP Calls Dropping

Detected by Pingoru
Sep 10, 2025, 05:54 PM UTC
Resolved
Aug 26, 2025, 05:00 AM UTC
Duration
Timeline · 1 update
  1. resolved Sep 10, 2025, 05:54 PM UTC

    Our SIP load balancer runs inside Kubernetes, which internally routes packets from the virtual pod IP to the host IP. We believe a Linux kernel bug related to conntrack is causing certain packets (including our 200 OK responses) to be lost in that process. This bug impacts about 2.5% of inbound calls over UDP. Because we send the 200 OK but it doesn't reach the trunking provider, our system believes the call is connected when it isn't. Since TCP retransmits lost packets, it is not impacted by this bug. We will be rolling out a permanent fix in the next few weeks.

Read the full incident report →

Minor September 10, 2025

Cloud Agents deploy issues

Detected by Pingoru
Sep 10, 2025, 05:09 PM UTC
Resolved
Sep 10, 2025, 07:27 PM UTC
Duration
2h 18m
Affected: Global Cloud Agents
Timeline · 5 updates
  1. investigating Sep 10, 2025, 05:09 PM UTC

    Some cloud agent builds are having problems getting scheduled; we're investigating this issue.

  2. identified Sep 10, 2025, 05:47 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Sep 10, 2025, 06:37 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Sep 10, 2025, 07:27 PM UTC

    This incident has been resolved.

  5. postmortem Sep 10, 2025, 07:28 PM UTC

    This issue was caused by lock contention in the cloud agents deployment code path, which prevented some builds from being deployed in a timely manner. The offending lock scope has been decreased significantly, which should ensure this issue doesn’t happen again. We’ve also added monitoring around the queue involved to ensure we are notified earlier of any similar issues.

Read the full incident report →

Notice August 26, 2025

Degradation in Egress, Ingress and SIP APIs

Detected by Pingoru
Aug 26, 2025, 04:59 PM UTC
Resolved
Aug 26, 2025, 08:09 PM UTC
Duration
3h 10m
Affected: Global SIP
Timeline · 2 updates
  1. monitoring Aug 26, 2025, 04:59 PM UTC

    We experienced spikes of slow database queries around 16:00 and 17:00 UTC on Aug 26, 2025 that affected the performance of the Egress, Ingress and SIP APIs in the US West region. We rolled out a fix at 17:34 UTC on Aug 26, 2025 and are now monitoring the situation.

  2. resolved Aug 26, 2025, 08:09 PM UTC

    This incident has been resolved.

Read the full incident report →

Major August 26, 2025

Transient disruption in US West and Brazil regions

Detected by Pingoru
Aug 26, 2025, 05:23 AM UTC
Resolved
Aug 26, 2025, 04:00 AM UTC
Duration
Timeline · 1 update
  1. resolved Aug 26, 2025, 05:23 AM UTC

    The load balancers in the US West and Brazil regions became overloaded and rejected a large number of connections between 04:01 and 04:04 UTC, around 3 minutes in total. During this time, some API requests and user connections in those regions were unavailable.

Read the full incident report →

Notice August 22, 2025

Temporary Egress Availability Issues

Detected by Pingoru
Aug 22, 2025, 04:49 PM UTC
Resolved
Aug 22, 2025, 08:00 PM UTC
Duration
3h 10m
Timeline · 1 update
  1. resolved Aug 22, 2025, 04:49 PM UTC

    From 15:58 to 16:18 UTC, around 2% of StartEgress requests failed with 503 Service Unavailable. The root cause was an issue with our autoscaling metrics, which caused our canary clusters to stop scaling. The change has been reverted and service is now back to normal.

Read the full incident report →

Minor August 5, 2025

Global analytics processing issue detected

Detected by Pingoru
Aug 05, 2025, 09:35 AM UTC
Resolved
Aug 05, 2025, 10:28 AM UTC
Duration
53m
Timeline · 2 updates
  1. investigating Aug 05, 2025, 09:35 AM UTC

    We have detected an issue with global analytics and are currently investigating.

  2. resolved Aug 05, 2025, 10:28 AM UTC

    The issue has been resolved. Our processing pipeline is working through the backlog and will catch up shortly.

Read the full incident report →

Notice August 2, 2025

Temporary networking disruption in US East 1

Detected by Pingoru
Aug 02, 2025, 09:41 AM UTC
Resolved
Aug 01, 2025, 04:00 PM UTC
Duration
Timeline · 1 update
  1. resolved Aug 02, 2025, 09:41 AM UTC

    Our alarms picked up higher-than-normal error rates on various RTC API calls. The disruption lasted from 16:05 to 16:09 UTC, impacting a portion of API requests in US East 1. We traced the root cause to internal networking within the data center. The issue recovered within a few minutes without intervention.

Read the full incident report →

Minor July 30, 2025

Analytics processing is delayed

Detected by Pingoru
Jul 30, 2025, 06:57 AM UTC
Resolved
Jul 30, 2025, 07:33 AM UTC
Duration
36m
Affected: Cloud Dashboard (cloud.livekit.io)
Timeline · 3 updates
  1. identified Jul 30, 2025, 06:57 AM UTC

    The issue has been identified and we are rolling out a fix.

  2. monitoring Jul 30, 2025, 07:10 AM UTC

    A fix has been deployed and we are monitoring the situation.

  3. resolved Jul 30, 2025, 07:33 AM UTC

    This incident has been resolved.

Read the full incident report →