LiveKit experienced a major incident on December 28, 2025 affecting US Central - Real Time Communication, lasting 8m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Dec 28, 2025, 12:29 AM UTC
We are investigating reports of connection failures originating from the Chicago region.
- identified Dec 28, 2025, 12:34 AM UTC
The error is due to an expired certificate in the US Central cluster. We are routing traffic around the region.
- resolved Dec 28, 2025, 12:37 AM UTC
We've resolved the root cause, the expired certificate, and service has been fully restored. This impacted core connectivity and API in the US Central (Chicago) region, including RTC, SIP, Phone numbers, Egress, and Ingress. A postmortem will follow shortly.
- postmortem Dec 28, 2025, 06:22 PM UTC
The service we use to auto-renew certificates throttles aggressively. When a renewal request is throttled or fails, we automatically retry it. This process usually succeeds and the certificate is renewed. However, in some cases the renewal process can get stuck, and we have alerting set up to catch these instances. Unfortunately, a logic error was introduced into this alert, so in this case a stuck renewal did not trigger an alert and the certificate eventually expired. We have validated that no other certificates are currently in this state, and we have fixed the alerting issue. Going forward, we will monitor changes made to this alerting process to ensure this does not happen again.
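
As an illustration only, and not LiveKit's actual alerting code, the sketch below shows one way to catch a stuck renewal: a check that looks only at each certificate's remaining validity, independent of the renewal process itself. The names `should_alert`, `expiring_certs`, and `ALERT_WINDOW`, as well as the 14-day threshold, are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: alert once 14 days or fewer of validity remain.
ALERT_WINDOW = timedelta(days=14)


def should_alert(not_after: datetime, now: datetime,
                 window: timedelta = ALERT_WINDOW) -> bool:
    """Return True when the certificate has expired or expires within `window`."""
    return not_after - now <= window


def expiring_certs(certs: dict[str, datetime], now: datetime) -> list[str]:
    """Return the sorted names of certificates that should raise an alert.

    `certs` maps a certificate name to its expiry time (notAfter).
    """
    return sorted(name for name, not_after in certs.items()
                  if should_alert(not_after, now))


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Example inventory with made-up names and expiry times.
    inventory = {
        "healthy.example": now + timedelta(days=60),
        "stuck-renewal.example": now + timedelta(days=2),
    }
    print(expiring_certs(inventory, now))
```

Because a check like this depends only on the expiry time, it would still fire if the renewal retries stalled silently, whatever the cause.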