Buildkite incident

Latency issues

Major, Resolved

Buildkite experienced a major incident on March 3, 2026 affecting Agent API and Job Queue, lasting 7h 32m. The incident has been resolved; the full update timeline is below.

Started: Mar 03, 2026, 09:51 PM UTC
Resolved: Mar 04, 2026, 05:24 AM UTC
Duration: 7h 32m
Detected by Pingoru: Mar 03, 2026, 09:51 PM UTC
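
As a quick cross-check, the duration can be recomputed from the Started and Resolved timestamps above. This is a minimal sketch: the helper name `elapsed_hm` and the format string are assumptions made for illustration, not anything the status page publishes. At minute precision the two timestamps are 7h 33m apart, one minute more than the listed 7h 32m; a plausible explanation (not confirmed by the source) is that the page derives its figure from timestamps with seconds that are not displayed.

```python
from datetime import datetime, timezone

# Assumed layout of the timestamps shown on this page, e.g. "Mar 03, 2026, 09:51 PM UTC".
# %b matches English month abbreviations under the default C/English locale.
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_utc(stamp: str) -> datetime:
    """Parse a page timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def elapsed_hm(start: str, end: str) -> str:
    """Return the elapsed time between two page timestamps as 'Xh Ym'."""
    total_seconds = int((parse_utc(end) - parse_utc(start)).total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    return f"{hours}h {remainder // 60}m"


if __name__ == "__main__":
    print(elapsed_hm("Mar 03, 2026, 09:51 PM UTC", "Mar 04, 2026, 05:24 AM UTC"))
```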

Affected components

Agent API, Job Queue

Update timeline

  1. investigating Mar 03, 2026, 09:51 PM UTC

    We're seeing elevated job dispatch latency and Agent API latency across multiple shards. We're investigating.

  2. investigating Mar 03, 2026, 10:41 PM UTC

    We're still experiencing latency issues for Agent API and job dispatch. We are continuing to investigate the root cause.

  3. investigating Mar 03, 2026, 11:21 PM UTC

    We continue to experience high latency on some services. We're continuing to identify root causes.

  4. monitoring Mar 04, 2026, 12:11 AM UTC

    We've made some changes to address the issue and are seeing signs of recovery. We continue to monitor the situation.

  5. monitoring Mar 04, 2026, 01:06 AM UTC

    We've seen a small number of unrelated issues, each affecting a subset of customers. Most impact is resolved, but we are continuing to monitor impact for a small number of remaining customers. We are in touch with those customers directly.

  6. monitoring Mar 04, 2026, 03:29 AM UTC

    We continue to observe high latency on isolated infrastructure serving Agent API endpoints for a subset of customers. We are provisioning additional capacity to address this latency, and have informed impacted customers.

  7. resolved Mar 04, 2026, 05:24 AM UTC

    We have completed the provisioning of additional capacity mentioned in our last update, and error rates and response times have returned to normal. This incident is now resolved.