LiveKit incident

Connection errors from Chicago region

Severity: Major. Status: Resolved.

LiveKit experienced a major incident on December 28, 2025, affecting the US Central - Real Time Communication component and lasting 8 minutes. The incident has been resolved; the full update timeline is below.

Started: Dec 28, 2025, 12:29 AM UTC
Resolved: Dec 28, 2025, 12:37 AM UTC
Duration: 8m
Detected by Pingoru: Dec 28, 2025, 12:29 AM UTC

Affected components

US Central - Real Time Communication

Update timeline

  1. investigating Dec 28, 2025, 12:29 AM UTC

    We are investigating reports of connection failures originating from the Chicago region.

  2. identified Dec 28, 2025, 12:34 AM UTC

    The error is due to an expired certificate in the US Central cluster. We are routing traffic around the region.

  3. resolved Dec 28, 2025, 12:37 AM UTC

    We've resolved the root cause, the expired certificate, and service has been fully restored. This impacted core connectivity and the API in the US Central (Chicago) region, including RTC, SIP, Phone numbers, Egress, and Ingress. A postmortem will follow shortly.

  4. postmortem Dec 28, 2025, 06:22 PM UTC

    The service we use to auto-renew certificates throttles aggressively. When a renewal request is throttled or fails, we automatically retry it. This usually succeeds and the certificate is renewed. However, in some cases the renewal process can get stuck, and we have alerting set up to catch these instances. Unfortunately, a logic error had been introduced into this alert, so in this case a stuck renewal did not trigger it and the certificate eventually expired. We have validated that no other certificates are currently in this state, and we have fixed the alerting. Going forward, we will monitor changes made to this alerting process to ensure this does not happen again.
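
The failure mode described in the postmortem (an alert tied to the renewal process silently failing) is commonly guarded against with a second check that looks only at the certificate's own expiry date, independent of renewal state. The following is a minimal sketch of that idea, not LiveKit's actual tooling: the names `needs_alert`, `fetch_not_after`, and the 14-day `ALERT_THRESHOLD` are assumptions chosen for illustration.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: alert when less than this much validity remains.
ALERT_THRESHOLD = timedelta(days=14)


def needs_alert(not_after: datetime, now: datetime,
                threshold: timedelta = ALERT_THRESHOLD) -> bool:
    """Return True if the certificate expires within `threshold` of `now`.

    The decision uses only the certificate's notAfter time, so it does not
    depend on whether the renewal process believes it is healthy or stuck.
    An already expired certificate also returns True.
    """
    return (not_after - now) <= threshold


def fetch_not_after(host: str, port: int = 443,
                    timeout: float = 5.0) -> datetime:
    """Read the notAfter time of the certificate served by host:port.

    With the default context, the TLS handshake raises
    ssl.SSLCertVerificationError if the certificate has already expired;
    a caller should treat that exception as an alert condition as well.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    # notAfter looks like "Jun  1 12:00:00 2026 GMT"; convert to epoch seconds.
    expires_epoch = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(expires_epoch, tz=timezone.utc)
```

A scheduled job could call `fetch_not_after` for each endpoint and page when `needs_alert` returns True. Because the check observes the served certificate rather than the renewal workflow, a logic error in the renewal alert would not suppress it.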