- Detected by Pingoru
- Jun 12, 2026, 06:26 PM UTC
- Resolved
- Jun 12, 2026, 06:26 PM UTC
- Duration
- —
Affected: AI Gateway, Web UI / Control Plane
Timeline · 2 updates
-
investigating Jun 12, 2026, 06:16 PM UTC
Status: Investigating Our US data plane is encountering issues. Affected components Centrally-Hosted Data Plane (US) (Full outage); Web UI / Control Plane (Operational); Centrally-Hosted Data Plane (EU) (Operational); AI Gateway (Operational)
-
resolved Jun 12, 2026, 06:26 PM UTC
Status: Resolved A configuration change was identified as the root cause and reverted. Affected components Centrally-Hosted Data Plane (US) (Operational); Web UI / Control Plane (Operational); Centrally-Hosted Data Plane (EU) (Operational); AI Gateway (Operational)
Read the full incident report →
- Detected by Pingoru
- Jun 04, 2026, 11:07 PM UTC
- Resolved
- Jun 04, 2026, 11:07 PM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 2 updates
-
investigating Jun 04, 2026, 10:51 PM UTC
Status: Investigating We're experiencing issues loading logs and some other pages in Braintrust. Our team is actively investigating and working to resolve. Affected components Web UI / Control Plane (Partial outage)
-
resolved Jun 04, 2026, 11:07 PM UTC
Status: Resolved We have pushed a fix and the loading issues have been remediated. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- May 26, 2026, 10:11 PM UTC
- Resolved
- May 26, 2026, 10:11 PM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 3 updates
-
investigating May 26, 2026, 09:33 PM UTC
Status: Investigating Our upstream auth provider is having issues. Fresh logins to Braintrust will fail with an error. Affected components Web UI / Control Plane (Partial outage)
-
investigating May 26, 2026, 10:10 PM UTC
Status: Investigating Our auth provider has recovered. Affected components Web UI / Control Plane (Operational)
-
resolved May 26, 2026, 10:11 PM UTC
Status: Resolved Our auth provider has recovered. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- May 16, 2026, 06:50 PM UTC
- Resolved
- May 16, 2026, 06:50 PM UTC
- Duration
- —
Affected: AI Gateway, Web UI / Control Plane
Timeline · 2 updates
-
investigating May 15, 2026, 08:00 PM UTC
Status: Investigating On May 15, some automations stopped running, which impacted scorer executions. Affected components Centrally-Hosted Data Plane (US) (Operational); Web UI / Control Plane (Degraded performance); Centrally-Hosted Data Plane (EU) (Operational); AI Gateway (Operational)
-
resolved May 16, 2026, 06:50 PM UTC
Status: Resolved We determined the issue was caused by a role used by the automation service losing required permissions following a configuration change. The issue was reported on May 15 at approximately 1:00 PM PT and resolved on May 16 at 11:50 AM PT. To help prevent similar incidents, we have improved our alerting and added audit logging for role and permission changes. Affected components Centrally-Hosted Data Plane (US) (Operational); Web UI / Control Plane (Operational); Centrally-Hosted Data Plane (EU) (Operational); AI Gateway (Operational)
Read the full incident report →
- Detected by Pingoru
- May 06, 2026, 12:41 AM UTC
- Resolved
- May 06, 2026, 12:41 AM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 5 updates
-
investigating May 05, 2026, 11:35 PM UTC
Status: Investigating We're observing a high rate of errors on the production control plane which is causing errors on braintrust.dev and data plane operations. Affected components Web UI / Control Plane (Partial outage)
-
identified May 06, 2026, 12:08 AM UTC
Status: Identified We have identified the cause and are working with an upstream provider on a fix. Affected components Web UI / Control Plane (Partial outage)
-
monitoring May 06, 2026, 12:10 AM UTC
Status: Monitoring We are actively monitoring the fix. Affected components Web UI / Control Plane (Partial outage)
-
resolved May 06, 2026, 12:23 AM UTC
Status: Resolved We are no longer seeing issues. Affected components Web UI / Control Plane (Operational)
-
resolved May 06, 2026, 12:41 AM UTC
Status: Resolved The issue was caused by our firewall provider unexpectedly blocking traffic despite safeguards we had in place to prevent this behavior. We are actively working with the vendor on additional mitigation measures to reduce the likelihood of this occurring again. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- Apr 26, 2026, 07:23 AM UTC
- Resolved
- Apr 26, 2026, 07:23 AM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 2 updates
-
investigating Apr 26, 2026, 07:17 AM UTC
Status: Investigating Our upstream auth provider is having an outage. Users are unable to log in to Braintrust. Affected components Web UI / Control Plane (Full outage)
-
resolved Apr 26, 2026, 07:23 AM UTC
Status: Resolved Provider has recovered. Error levels back to zero. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- Apr 20, 2026, 03:49 AM UTC
- Resolved
- Apr 20, 2026, 03:49 AM UTC
- Duration
- —
Affected: AI Gateway, Web UI / Control Plane
Timeline · 3 updates
-
identified Apr 20, 2026, 03:33 AM UTC
Status: Identified We're experiencing an issue with database connectivity and are attempting to mitigate it by connecting through an alternative method. Affected components Centrally-Hosted Data Plane (US) (Degraded performance); Web UI / Control Plane (Full outage); Centrally-Hosted Data Plane (EU) (Degraded performance); AI Gateway (Degraded performance)
-
monitoring Apr 20, 2026, 03:46 AM UTC
Status: Monitoring We've updated our database connection method and are observing recovery. We're monitoring for any remaining impact. Affected components AI Gateway (Degraded performance); Centrally-Hosted Data Plane (US) (Degraded performance); Web UI / Control Plane (Full outage); Centrally-Hosted Data Plane (EU) (Degraded performance)
-
resolved Apr 20, 2026, 03:49 AM UTC
Status: Resolved We're no longer seeing impact from the connectivity issues, and we are still investigating the root cause. Affected components AI Gateway (Operational); Centrally-Hosted Data Plane (US) (Operational); Web UI / Control Plane (Operational); Centrally-Hosted Data Plane (EU) (Operational)
Read the full incident report →
- Detected by Pingoru
- Apr 13, 2026, 10:21 PM UTC
- Resolved
- Apr 13, 2026, 09:51 PM UTC
- Duration
- —
Read the full incident report →
- Detected by Pingoru
- Mar 31, 2026, 06:05 PM UTC
- Resolved
- Mar 31, 2026, 06:05 PM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 2 updates
-
investigating Mar 31, 2026, 05:52 PM UTC
Status: Investigating We're seeing a high rate of 5xx errors. Affected components Web UI / Control Plane (Partial outage)
-
resolved Mar 31, 2026, 06:05 PM UTC
Status: Resolved Error rates have dropped and functionality is restored. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- Mar 21, 2026, 12:23 AM UTC
- Resolved
- Mar 21, 2026, 12:23 AM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 5 updates
-
investigating Mar 20, 2026, 11:07 PM UTC
Status: Investigating The control plane is experiencing very high load. Affected components Web UI / Control Plane (Partial outage)
-
monitoring Mar 20, 2026, 11:14 PM UTC
Status: Monitoring Error rates have recovered but database usage is still high. Affected components Web UI / Control Plane (Degraded performance)
-
monitoring Mar 21, 2026, 12:09 AM UTC
Status: Monitoring Database size is being increased. This will cause a short outage. Affected components Web UI / Control Plane (Full outage)
-
monitoring Mar 21, 2026, 12:15 AM UTC
Status: Monitoring DB upgrade complete. CPU usage appears to be at normal levels. Monitoring for now. Affected components Web UI / Control Plane (Operational)
-
resolved Mar 21, 2026, 12:23 AM UTC
Status: Resolved Load issues have subsided. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- Mar 20, 2026, 10:32 AM UTC
- Resolved
- Mar 20, 2026, 10:32 AM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 3 updates
-
investigating Mar 20, 2026, 09:53 AM UTC
Status: Investigating Degraded performance of our authorization provider is causing issues with accessing the control plane. Affected components Web UI / Control Plane (Degraded performance)
-
monitoring Mar 20, 2026, 10:21 AM UTC
Status: Monitoring The authorization provider has applied a mitigation and we're seeing signs of recovery. We're continuing to monitor until all errors have subsided. Affected components Web UI / Control Plane (Operational)
-
resolved Mar 20, 2026, 10:32 AM UTC
Status: Resolved The authorization provider has completed mitigation work and is reporting a full recovery in their systems. We are no longer observing impact to users. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- Mar 10, 2026, 07:25 PM UTC
- Resolved
- Mar 10, 2026, 07:11 PM UTC
- Duration
- —
Read the full incident report →
- Detected by Pingoru
- Mar 03, 2026, 04:24 PM UTC
- Resolved
- Mar 03, 2026, 04:24 PM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 2 updates
-
investigating Mar 03, 2026, 03:50 PM UTC
Status: Investigating S3 automations have stopped running on their expected schedule. Affected components Web UI / Control Plane (Partial outage)
-
resolved Mar 03, 2026, 04:24 PM UTC
Status: Resolved Affected a small subset of users. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- Feb 18, 2026, 09:35 PM UTC
- Resolved
- Feb 18, 2026, 09:35 PM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 2 updates
-
investigating Feb 18, 2026, 09:32 PM UTC
Status: Investigating We have reports of the UI hanging, particularly in dataset or trace views. We are narrowing down the code change causing the issue. Please stand by; we are aware of the problem and working to resolve it. Affected components Web UI / Control Plane (Degraded performance)
-
resolved Feb 18, 2026, 09:35 PM UTC
Status: Resolved The issue has been resolved and you should no longer see slowness in the Braintrust Control Plane UI. Please let us know if you have any issues. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- Feb 10, 2026, 06:43 PM UTC
- Resolved
- Feb 10, 2026, 06:43 PM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 4 updates
-
investigating Feb 10, 2026, 02:49 PM UTC
Status: Investigating We're currently experiencing issues with the Logs page displaying errors. We're investigating and will update as soon as we have more information. Affected components Web UI / Control Plane (Degraded performance)
-
identified Feb 10, 2026, 03:12 PM UTC
Status: Identified We've identified that the issue is due to an undefined value in the Output column's tooltip on the Logs page. We are currently working to resolve it. Affected components Web UI / Control Plane (Degraded performance)
-
monitoring Feb 10, 2026, 05:34 PM UTC
Status: Monitoring The fix for the errors on the column tooltips is deployed and we are currently monitoring to confirm resolution. Affected components Web UI / Control Plane (Degraded performance)
-
resolved Feb 10, 2026, 06:43 PM UTC
Status: Resolved Pushed out a fix. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- Feb 04, 2026, 04:19 PM UTC
- Resolved
- Feb 04, 2026, 04:19 PM UTC
- Duration
- —
Affected: Web UI / Control Plane
Timeline · 2 updates
-
monitoring Feb 04, 2026, 04:16 PM UTC
Status: Monitoring We experienced an issue with Playground Trace views in which they appeared to load indefinitely while successfully completing their backend processing. Affected components Web UI / Control Plane (Partial outage)
-
resolved Feb 04, 2026, 04:19 PM UTC
Status: Resolved We resolved the issue by reverting the inciting commit. Affected components Web UI / Control Plane (Operational)
Read the full incident report →
- Detected by Pingoru
- Feb 03, 2026, 05:23 PM UTC
- Resolved
- Feb 03, 2026, 11:20 AM UTC
- Duration
- —
Read the full incident report →
- Detected by Pingoru
- Dec 17, 2025, 11:27 PM UTC
- Resolved
- Dec 18, 2025, 12:09 AM UTC
- Duration
- 42m
Read the full incident report →
- Detected by Pingoru
- Nov 24, 2025, 08:43 PM UTC
- Resolved
- Nov 24, 2025, 09:05 PM UTC
- Duration
- 22m
Read the full incident report →
- Detected by Pingoru
- Oct 31, 2025, 06:53 PM UTC
- Resolved
- Nov 02, 2025, 09:03 PM UTC
- Duration
- 2d 2h
Read the full incident report →
- Detected by Pingoru
- Oct 29, 2025, 08:35 PM UTC
- Resolved
- Oct 29, 2025, 09:08 PM UTC
- Duration
- 32m
Read the full incident report →
- Detected by Pingoru
- Oct 22, 2025, 08:54 PM UTC
- Resolved
- Oct 22, 2025, 10:31 PM UTC
- Duration
- 1h 36m
Read the full incident report →
Critical October 20, 2025
- Detected by Pingoru
- Oct 20, 2025, 07:19 AM UTC
- Resolved
- Oct 20, 2025, 10:57 AM UTC
- Duration
- 3h 38m
Read the full incident report →
- Detected by Pingoru
- Sep 23, 2025, 07:30 PM UTC
- Resolved
- Sep 24, 2025, 01:24 AM UTC
- Duration
- 5h 54m
Read the full incident report →
- Detected by Pingoru
- Sep 03, 2025, 05:40 PM UTC
- Resolved
- Sep 03, 2025, 06:07 PM UTC
- Duration
- 26m
Read the full incident report →