Braintrust Outage History

Braintrust is up right now

Braintrust had 27 outages in the last 2 years, totaling 63h 23m of downtime and averaging 1.1 incidents per month.

Each outage recorded since July 7, 2025 is summarised below with incident details, duration, and resolution information.
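As a rough check on the summary figures, the sketch below recomputes the monthly incident rate and the mean downtime per incident from the totals quoted above. It assumes that "the last 2 years" means a 24-month window; the function name `summarize` and its parameters are illustrative only and not part of any Braintrust or Pingoru tooling.

```python
from datetime import timedelta


def summarize(outages: int, downtime: timedelta, months: int) -> tuple[float, timedelta]:
    """Return (incidents per month, mean downtime per incident)."""
    per_month = outages / months
    mean_downtime = downtime / outages
    return per_month, mean_downtime


# Totals quoted in the summary; the 24-month window is an assumption.
per_month, mean_downtime = summarize(27, timedelta(hours=63, minutes=23), 24)

print(f"{per_month:.1f} incidents per month")
print(f"mean downtime per incident: {mean_downtime}")
```

Under the 24-month assumption the rate rounds to the 1.1 incidents per month quoted in the summary.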

Source: https://status.braintrust.dev

Minor June 12, 2026

US Data plane outage

Detected by Pingoru
Jun 12, 2026, 06:26 PM UTC
Resolved
Jun 12, 2026, 06:26 PM UTC
Duration
Affected: AI Gateway, Web UI / Control Plane
Timeline · 2 updates
  1. investigating Jun 12, 2026, 06:16 PM UTC

    Status: Investigating. Our US data plane is encountering issues. Affected components: Centrally-Hosted Data Plane (US) (Full outage), Web UI / Control Plane (Operational), Centrally-Hosted Data Plane (EU) (Operational), AI Gateway (Operational).

  2. resolved Jun 12, 2026, 06:26 PM UTC

    Status: Resolved. A configuration change was identified as the root cause and reverted. Affected components: Centrally-Hosted Data Plane (US) (Operational), Web UI / Control Plane (Operational), Centrally-Hosted Data Plane (EU) (Operational), AI Gateway (Operational).


Minor June 4, 2026

Issues loading logs page

Detected by Pingoru
Jun 04, 2026, 11:07 PM UTC
Resolved
Jun 04, 2026, 11:07 PM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 2 updates
  1. investigating Jun 04, 2026, 10:51 PM UTC

    Status: Investigating. We're experiencing issues loading logs and some other pages in Braintrust. Our team is actively investigating and working to resolve them. Affected components: Web UI / Control Plane (Partial outage).

  2. resolved Jun 04, 2026, 11:07 PM UTC

    Status: Resolved. We have pushed a fix and the loading issues have been remediated. Affected components: Web UI / Control Plane (Operational).


Minor May 26, 2026

Errors logging into Braintrust

Detected by Pingoru
May 26, 2026, 10:11 PM UTC
Resolved
May 26, 2026, 10:11 PM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 3 updates
  1. investigating May 26, 2026, 09:33 PM UTC

    Status: Investigating. Our upstream auth provider is having issues. Fresh logins to Braintrust will fail with an error. Affected components: Web UI / Control Plane (Partial outage).

  2. investigating May 26, 2026, 10:10 PM UTC

    Status: Investigating. Our auth provider has recovered. Affected components: Web UI / Control Plane (Operational).

  3. resolved May 26, 2026, 10:11 PM UTC

    Status: Resolved. Our auth provider has recovered. Affected components: Web UI / Control Plane (Operational).


Minor May 16, 2026

Automation degradation

Detected by Pingoru
May 16, 2026, 06:50 PM UTC
Resolved
May 16, 2026, 06:50 PM UTC
Duration
Affected: AI Gateway, Web UI / Control Plane
Timeline · 2 updates
  1. investigating May 15, 2026, 08:00 PM UTC

    Status: Investigating. On May 15, some automations stopped running, which impacted scorer executions. Affected components: Centrally-Hosted Data Plane (US) (Operational), Web UI / Control Plane (Degraded performance), Centrally-Hosted Data Plane (EU) (Operational), AI Gateway (Operational).

  2. resolved May 16, 2026, 06:50 PM UTC

    Status: Resolved. We determined the issue was caused by a role used by the automation service losing required permissions following a configuration change. The issue was reported on May 15 at approximately 1:00 PM PT and resolved on May 16 at 11:50 AM PT. To help prevent similar incidents, we have improved our alerting and added audit logging for role and permission changes. Affected components: Centrally-Hosted Data Plane (US) (Operational), Web UI / Control Plane (Operational), Centrally-Hosted Data Plane (EU) (Operational), AI Gateway (Operational).
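The resolution note gives times in PT while the timeline entries use UTC. The sketch below converts the two PT times to UTC and computes the elapsed time between them. It assumes that "PT" in mid-May means Pacific Daylight Time (UTC-7) and treats the "approximately 1:00 PM" report time as exact; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "PT" in mid-May is Pacific Daylight Time, a fixed UTC-7 offset.
PDT = timezone(timedelta(hours=-7), "PDT")

reported = datetime(2026, 5, 15, 13, 0, tzinfo=PDT)   # approx. 1:00 PM PT
resolved = datetime(2026, 5, 16, 11, 50, tzinfo=PDT)  # 11:50 AM PT

# Convert both instants to UTC so they can be compared with the timeline.
reported_utc = reported.astimezone(timezone.utc)
resolved_utc = resolved.astimezone(timezone.utc)

# Subtracting two aware datetimes yields the elapsed time as a timedelta.
elapsed = resolved - reported

print(reported_utc.strftime("%b %d, %Y, %I:%M %p UTC"))
print(resolved_utc.strftime("%b %d, %Y, %I:%M %p UTC"))
print(f"elapsed: {elapsed}")
```

Under these assumptions the converted times line up with the two timeline timestamps for this incident (08:00 PM UTC on May 15 and 06:50 PM UTC on May 16), and the elapsed time comes to 22 hours 50 minutes.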


Minor May 6, 2026

High rate of errors on control plane

Detected by Pingoru
May 06, 2026, 12:41 AM UTC
Resolved
May 06, 2026, 12:41 AM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 5 updates
  1. investigating May 05, 2026, 11:35 PM UTC

    Status: Investigating. We're observing a high rate of errors on the production control plane, which is causing errors on braintrust.dev and in data plane operations. Affected components: Web UI / Control Plane (Partial outage).

  2. identified May 06, 2026, 12:08 AM UTC

    Status: Identified. We have identified the cause and are working with an upstream provider on a fix. Affected components: Web UI / Control Plane (Partial outage).

  3. monitoring May 06, 2026, 12:10 AM UTC

    Status: Monitoring. We are actively monitoring the fix. Affected components: Web UI / Control Plane (Partial outage).

  4. resolved May 06, 2026, 12:23 AM UTC

    Status: Resolved. We are no longer seeing issues. Affected components: Web UI / Control Plane (Operational).

  5. resolved May 06, 2026, 12:41 AM UTC

    Status: Resolved. The issue was caused by our firewall provider unexpectedly blocking traffic despite safeguards we had in place to prevent this behavior. We are actively working with the vendor on additional mitigation measures to reduce the likelihood of this occurring again. Affected components: Web UI / Control Plane (Operational).


Minor April 26, 2026

Control plane outage

Detected by Pingoru
Apr 26, 2026, 07:23 AM UTC
Resolved
Apr 26, 2026, 07:23 AM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 2 updates
  1. investigating Apr 26, 2026, 07:17 AM UTC

    Status: Investigating. Our upstream auth provider is having an outage. Users are unable to log in to Braintrust. Affected components: Web UI / Control Plane (Full outage).

  2. resolved Apr 26, 2026, 07:23 AM UTC

    Status: Resolved. The provider has recovered. Error levels are back to zero. Affected components: Web UI / Control Plane (Operational).


Minor April 20, 2026

Control Plane Database Connectivity Issues

Detected by Pingoru
Apr 20, 2026, 03:49 AM UTC
Resolved
Apr 20, 2026, 03:49 AM UTC
Duration
Affected: AI Gateway, Web UI / Control Plane
Timeline · 3 updates
  1. identified Apr 20, 2026, 03:33 AM UTC

    Status: Identified. We're currently experiencing an issue with database connectivity. We're attempting to mitigate it by connecting through an alternative method. Affected components: Centrally-Hosted Data Plane (US) (Degraded performance), Web UI / Control Plane (Full outage), Centrally-Hosted Data Plane (EU) (Degraded performance), AI Gateway (Degraded performance).

  2. monitoring Apr 20, 2026, 03:46 AM UTC

    Status: Monitoring. We've updated our database connection method and are observing recovery. We're monitoring for any remaining impact. Affected components: AI Gateway (Degraded performance), Centrally-Hosted Data Plane (US) (Degraded performance), Web UI / Control Plane (Full outage), Centrally-Hosted Data Plane (EU) (Degraded performance).

  3. resolved Apr 20, 2026, 03:49 AM UTC

    Status: Resolved. We're no longer seeing impact from the connectivity issues, and we are still investigating the root cause. Affected components: AI Gateway (Operational), Centrally-Hosted Data Plane (US) (Operational), Web UI / Control Plane (Operational), Centrally-Hosted Data Plane (EU) (Operational).


Minor March 31, 2026

Elevated control plane errors

Detected by Pingoru
Mar 31, 2026, 06:05 PM UTC
Resolved
Mar 31, 2026, 06:05 PM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 2 updates
  1. investigating Mar 31, 2026, 05:52 PM UTC

    Status: Investigating. We're seeing a high rate of 5xx errors. Affected components: Web UI / Control Plane (Partial outage).

  2. resolved Mar 31, 2026, 06:05 PM UTC

    Status: Resolved. Error rates have dropped and functionality is restored. Affected components: Web UI / Control Plane (Operational).


Minor March 21, 2026

Database load issues with control plane

Detected by Pingoru
Mar 21, 2026, 12:23 AM UTC
Resolved
Mar 21, 2026, 12:23 AM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 5 updates
  1. investigating Mar 20, 2026, 11:07 PM UTC

    Status: Investigating. The control plane is experiencing very high load. Affected components: Web UI / Control Plane (Partial outage).

  2. monitoring Mar 20, 2026, 11:14 PM UTC

    Status: Monitoring. Error rates have recovered, but database usage is still high. Affected components: Web UI / Control Plane (Degraded performance).

  3. monitoring Mar 21, 2026, 12:09 AM UTC

    Status: Monitoring. Database size is being increased. This will cause a short outage. Affected components: Web UI / Control Plane (Full outage).

  4. monitoring Mar 21, 2026, 12:15 AM UTC

    Status: Monitoring. DB upgrade complete. CPU usage appears to be at normal levels. Monitoring for now. Affected components: Web UI / Control Plane (Operational).

  5. resolved Mar 21, 2026, 12:23 AM UTC

    Status: Resolved. Load issues have subsided. Affected components: Web UI / Control Plane (Operational).


Minor March 20, 2026

Authorization service in degraded state

Detected by Pingoru
Mar 20, 2026, 10:32 AM UTC
Resolved
Mar 20, 2026, 10:32 AM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 3 updates
  1. investigating Mar 20, 2026, 09:53 AM UTC

    Status: Investigating. Degraded performance of our authorization provider is causing issues with accessing the control plane. Affected components: Web UI / Control Plane (Degraded performance).

  2. monitoring Mar 20, 2026, 10:21 AM UTC

    Status: Monitoring. The authorization provider has applied a mitigation and we're seeing signs of recovery. We're continuing to monitor until all errors have subsided. Affected components: Web UI / Control Plane (Operational).

  3. resolved Mar 20, 2026, 10:32 AM UTC

    Status: Resolved. The authorization provider has completed mitigation work and is reporting a full recovery in their systems. We are no longer observing impact to users. Affected components: Web UI / Control Plane (Operational).


Minor March 3, 2026

S3 Export Automations failing

Detected by Pingoru
Mar 03, 2026, 04:24 PM UTC
Resolved
Mar 03, 2026, 04:24 PM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 2 updates
  1. investigating Mar 03, 2026, 03:50 PM UTC

    Status: Investigating. S3 export automations have stopped running on their expected schedule. Affected components: Web UI / Control Plane (Partial outage).

  2. resolved Mar 03, 2026, 04:24 PM UTC

    Status: Resolved. The issue affected a small subset of users. Affected components: Web UI / Control Plane (Operational).


Minor February 18, 2026

Control Plane UI Hanging

Detected by Pingoru
Feb 18, 2026, 09:35 PM UTC
Resolved
Feb 18, 2026, 09:35 PM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 2 updates
  1. investigating Feb 18, 2026, 09:32 PM UTC

    Status: Investigating. We have reports of the UI hanging, particularly in dataset or trace views. We are narrowing down the code change causing the issue. Please stand by; we are aware of the problem and working to resolve it. Affected components: Web UI / Control Plane (Degraded performance).

  2. resolved Feb 18, 2026, 09:35 PM UTC

    Status: Resolved. The issue has been resolved and you should no longer see slowness in the Braintrust Control Plane UI. Please let us know if you have any issues. Affected components: Web UI / Control Plane (Operational).


Minor February 10, 2026

Logs Page Errors

Detected by Pingoru
Feb 10, 2026, 06:43 PM UTC
Resolved
Feb 10, 2026, 06:43 PM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 4 updates
  1. investigating Feb 10, 2026, 02:49 PM UTC

    Status: Investigating. We're currently experiencing issues with the Logs page displaying errors. We're investigating and will update as soon as we have more information. Affected components: Web UI / Control Plane (Degraded performance).

  2. identified Feb 10, 2026, 03:12 PM UTC

    Status: Identified. We've identified that the issue is due to an undefined value in the Output column's tooltip on the Logs page. We are currently working to resolve it. Affected components: Web UI / Control Plane (Degraded performance).

  3. monitoring Feb 10, 2026, 05:34 PM UTC

    Status: Monitoring. The fix for the errors on the column tooltips has been deployed, and we are monitoring to confirm resolution. Affected components: Web UI / Control Plane (Degraded performance).

  4. resolved Feb 10, 2026, 06:43 PM UTC

    Status: Resolved. We pushed out a fix. Affected components: Web UI / Control Plane (Operational).


Minor February 4, 2026

Traces Appearing to Load Indefinitely

Detected by Pingoru
Feb 04, 2026, 04:19 PM UTC
Resolved
Feb 04, 2026, 04:19 PM UTC
Duration
Affected: Web UI / Control Plane
Timeline · 2 updates
  1. monitoring Feb 04, 2026, 04:16 PM UTC

    Status: Monitoring. We experienced an issue in which Playground Trace views appeared to load indefinitely even though their backend processing completed successfully. Affected components: Web UI / Control Plane (Partial outage).

  2. resolved Feb 04, 2026, 04:19 PM UTC

    Status: Resolved. We resolved the issue by reverting the inciting commit. Affected components: Web UI / Control Plane (Operational).
