LiveKit incident

Cloud Agents deploy issues

Minor · Resolved

LiveKit experienced a minor incident on September 10, 2025, affecting Global Cloud Agents and lasting 2h 18m. The incident has been resolved; the full update timeline is below.

Started
Sep 10, 2025, 05:09 PM UTC
Resolved
Sep 10, 2025, 07:27 PM UTC
Duration
2h 18m
Detected by Pingoru
Sep 10, 2025, 05:09 PM UTC

Affected components

Global Cloud Agents

Update timeline

  1. investigating Sep 10, 2025, 05:09 PM UTC

    Some cloud agent builds are having problems getting scheduled; we're investigating this issue.

  2. identified Sep 10, 2025, 05:47 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Sep 10, 2025, 06:37 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Sep 10, 2025, 07:27 PM UTC

    This incident has been resolved.

  5. postmortem Sep 10, 2025, 07:28 PM UTC

    This issue was caused by lock contention in the cloud agents deployment code path, which prevented some builds from being deployed in a timely manner. The scope of the offending lock has been reduced significantly, which should prevent this issue from recurring. We’ve also added monitoring around the queue involved so that we are notified earlier of any similar issues.
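
The postmortem does not include code, so the following is only a minimal sketch of the general technique it names: reducing lock scope so that slow work no longer runs while a shared lock is held. Everything here is hypothetical and for illustration (the `run_workers` helper, the `DEPLOY_SECONDS` delay, the build names, the worker count); it is not LiveKit's deployment code. With the wide scope, each worker holds the lock for the whole simulated deploy, so deploys are serialized. With the narrow scope, the lock only guards the shared queue and the slow step runs in parallel.

```python
import threading
import time
from collections import deque

DEPLOY_SECONDS = 0.02  # hypothetical stand-in for a slow deploy step


def run_workers(builds, narrow_scope, workers=4):
    """Drain `builds` with several workers; return (elapsed_seconds, deployed)."""
    queue = deque(builds)
    lock = threading.Lock()
    deployed = []

    def deploy(build):
        time.sleep(DEPLOY_SECONDS)  # simulated slow work; touches no shared state
        return build

    def worker():
        while True:
            if narrow_scope:
                # Narrow scope: hold the lock only while touching the shared queue.
                with lock:
                    if not queue:
                        return
                    build = queue.popleft()
                result = deploy(build)  # slow step runs outside the lock
                with lock:
                    deployed.append(result)
            else:
                # Wide scope: the slow step runs while the lock is held,
                # so every other worker waits and deploys are serialized.
                with lock:
                    if not queue:
                        return
                    build = queue.popleft()
                    deployed.append(deploy(build))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, deployed


if __name__ == "__main__":
    builds = [f"build-{i}" for i in range(8)]
    wide, _ = run_workers(builds, narrow_scope=False)
    narrow, _ = run_workers(builds, narrow_scope=True)
    print(f"wide lock scope:   {wide:.3f}s")
    print(f"narrow lock scope: {narrow:.3f}s")
```

In this toy setup the wide-scope run takes at least eight times the simulated deploy delay, because only one deploy can proceed at a time, while the narrow-scope run lets the four workers overlap their deploys. Exact timings depend on the machine, so none are quoted here.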