Buildkite incident

Increased error rates from Test Plan API

Major · Resolved
Started
Mar 10, 2026, 01:21 AM UTC
Resolved
Mar 10, 2026, 09:34 AM UTC
Duration
8h 13m
Detected by Pingoru
Mar 10, 2026, 01:21 AM UTC

Affected components

REST API

Update timeline

  1. investigating Mar 10, 2026, 01:21 AM UTC

    We've observed test splitting plan requests periodically timing out and falling back to non-intelligent splitting. Performance appears to have returned to normal about an hour ago. We are continuing to investigate the root cause and resolve the underlying issue.

  2. monitoring Mar 10, 2026, 02:25 AM UTC

    We have implemented several mitigations and are continuing to work on fixing the underlying cause. Our team is actively monitoring the situation to ensure stability. We will provide further updates as we make progress on resolving this issue.

  3. resolved Mar 10, 2026, 09:34 AM UTC

    Our mitigations have resolved the elevated latency and the increased likelihood of suboptimal fallback test plans. We have also identified and fixed a blind spot in our automated alerting, which was previously unable to detect this scenario as an issue. Work continues this week to resolve the underlying performance issue by restructuring how the relevant data is ingested and accessed.
