Weights & Biases Outage History

Weights & Biases is up right now

Weights & Biases had 17 outages in the last 2 years, totaling 259h 19m of downtime and averaging 0.7 incidents per month.

All 17 Weights & Biases outages occurred since January 13, 2026. Each is summarised below with incident details, duration, and resolution information.

Source: https://status.wandb.com
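
The summary figures can be cross-checked against the incident list. The following is a minimal Python sketch, not the method used to produce this page: `parse_duration` is a hypothetical helper, the durations are copied from the 17 incidents listed below, and the 24-month window is an assumption based on the phrase "last 2 years". The listed durations sum to slightly less than the headline 259h 19m, presumably because displayed values are truncated (for example "5d 22h" omits minutes).

```python
import re

# Durations as displayed for the 17 incidents below, newest first.
DURATIONS = [
    "13h 22m", "7h 20m", "22m", "6h 15m", "2h 47m", "4h 50m",
    "12h 13m", "3h 26m", "5d 22h", "12h 56m", "44m", "1h 19m",
    "23h 34m", "5h 15m", "2h 26m", "6h 16m", "13h 38m",
]

MINUTES_PER_UNIT = {"d": 24 * 60, "h": 60, "m": 1}


def parse_duration(text: str) -> int:
    """Convert a string such as '5d 22h' or '44m' to whole minutes."""
    parts = re.findall(r"(\d+)([dhm])", text)
    if not parts:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(int(value) * MINUTES_PER_UNIT[unit] for value, unit in parts)


total_minutes = sum(parse_duration(d) for d in DURATIONS)
hours, minutes = divmod(total_minutes, 60)

WINDOW_MONTHS = 24  # assumption: "last 2 years" means 24 months
incidents_per_month = len(DURATIONS) / WINDOW_MONTHS

print(f"{len(DURATIONS)} incidents, {hours}h {minutes}m listed")
print(f"{incidents_per_month:.1f} incidents per month")
```

Dividing 17 incidents by 24 months gives roughly 0.7, which matches the stated average.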

Critical June 14, 2026

Web App loading issue

Detected by Pingoru
Jun 14, 2026, 07:37 AM UTC
Resolved
Jun 14, 2026, 08:59 PM UTC
Duration
13h 22m
Affected: Frontend
Timeline · 4 updates
  1. investigating Jun 14, 2026, 07:37 AM UTC

    We're investigating reports of the W&B web app (app.wandb.ai) showing a blank page on load for some users. The API and SDK data logging remain operational. A fix is in progress and we'll share an update soon.

  2. investigating Jun 14, 2026, 07:37 AM UTC

    We are continuing to investigate this issue.

  3. monitoring Jun 14, 2026, 09:30 AM UTC

    A fix has been deployed and we are monitoring.

  4. resolved Jun 14, 2026, 08:59 PM UTC

    Web app has been restored.
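
Each Duration value appears to be the gap between the detection and resolution timestamps. The following is a minimal Python sketch of that calculation for this incident, assuming the timestamp format shown on this page. `parse_timestamp`, `incident_duration`, and `format_duration` are hypothetical helpers, not part of any W&B or Pingoru API.

```python
from datetime import datetime, timedelta, timezone

# Format used by timestamps on this page, e.g. "Jun 14, 2026, 07:37 AM UTC".
TIMESTAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_timestamp(text: str) -> datetime:
    """Parse a page timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(text, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def incident_duration(detected: str, resolved: str) -> timedelta:
    """Return the elapsed time between detection and resolution."""
    return parse_timestamp(resolved) - parse_timestamp(detected)


def format_duration(delta: timedelta) -> str:
    """Render a timedelta as hours and minutes, e.g. '13h 22m'."""
    total_minutes = int(delta.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m"


elapsed = incident_duration(
    "Jun 14, 2026, 07:37 AM UTC",
    "Jun 14, 2026, 08:59 PM UTC",
)
print(format_duration(elapsed))  # 13h 22m
```

Because the page shows timestamps only to the minute, a result computed this way can differ from a listed duration by about a minute for other incidents.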

Minor June 9, 2026

Metric ingestion delayed

Detected by Pingoru
Jun 09, 2026, 11:47 PM UTC
Resolved
Jun 10, 2026, 07:08 AM UTC
Duration
7h 20m
Affected: Metrics Ingestion Pipeline
Timeline · 3 updates
  1. identified Jun 09, 2026, 11:47 PM UTC

    We are currently responding to a delay of up to 5 hours for metrics written into wandb. We have identified the issue and are working to process metrics as quickly as possible. There is no data loss and all data will be complete once the backlog has drained. We're very sorry for the disruption.

  2. monitoring Jun 10, 2026, 05:44 AM UTC

    We identified an infrastructure issue as the root cause and have resolved it. We are rapidly working through the backlog of ingested metrics and should fully catch up soon.

  3. resolved Jun 10, 2026, 07:08 AM UTC

    This incident has been resolved.

Minor June 9, 2026

Metrics ingestion delay

Detected by Pingoru
Jun 09, 2026, 09:32 AM UTC
Resolved
Jun 09, 2026, 09:55 AM UTC
Duration
22m
Affected: Metrics Ingestion Pipeline
Timeline · 2 updates
  1. investigating Jun 09, 2026, 09:32 AM UTC

    We are currently investigating an issue where metric ingestion is delayed by up to 1 hour for some runs.

  2. resolved Jun 09, 2026, 09:55 AM UTC

    This incident has been resolved.

Notice May 26, 2026

Slowness and missing metrics

Detected by Pingoru
May 26, 2026, 07:47 PM UTC
Resolved
May 27, 2026, 02:02 AM UTC
Duration
6h 15m
Affected: Backend API, Frontend
Timeline · 2 updates
  1. monitoring May 26, 2026, 07:47 PM UTC

    Starting at 10:48 AM PST, we experienced an incident causing missing metrics and slowness across the application. As of 12:10 PM, metrics should be restored (there was no data loss) and we are continuing to monitor performance.

  2. resolved May 27, 2026, 02:02 AM UTC

    This incident has been resolved.

Minor May 15, 2026

Delayed Metrics Processing – Workspace & API Updates Impacted

Detected by Pingoru
May 15, 2026, 01:51 PM UTC
Resolved
May 15, 2026, 04:39 PM UTC
Duration
2h 47m
Affected: Backend API, Metrics Ingestion Pipeline
Timeline · 2 updates
  1. identified May 15, 2026, 01:51 PM UTC

    We are currently experiencing delays in processing newly logged metrics. As a result, recent metrics from active runs may take longer than usual to appear in charts, dashboards, reports, and API queries. No data is lost. All metrics are being processed and will become visible once the backlog clears. We are actively working to restore normal processing times and will provide updates here.

  2. resolved May 15, 2026, 04:39 PM UTC

    This incident has been resolved.

Notice May 10, 2026

Run Update Ingest Delays

Detected by Pingoru
May 10, 2026, 01:19 PM UTC
Resolved
May 10, 2026, 06:09 PM UTC
Duration
4h 50m
Affected: Backend API
Timeline · 3 updates
  1. identified May 10, 2026, 01:19 PM UTC

    We identified an issue where updates to W&B Models runs may be delayed for up to 30 minutes. There is no data loss.

  2. monitoring May 10, 2026, 02:56 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved May 10, 2026, 06:09 PM UTC

    This incident has been resolved.

Notice May 4, 2026

API Errors

Detected by Pingoru
May 04, 2026, 03:50 PM UTC
Resolved
May 05, 2026, 04:03 AM UTC
Duration
12h 13m
Affected: Backend API, Frontend
Timeline · 6 updates
  1. investigating May 04, 2026, 03:50 PM UTC

    We are currently investigating this issue.

  2. monitoring May 04, 2026, 05:57 PM UTC

    The frontend and backend APIs have fully recovered after we mitigated excess load on the system. We will continue to actively monitor the situation to ensure stability.

  3. investigating May 04, 2026, 10:42 PM UTC

    We are currently investigating this issue.

  4. monitoring May 05, 2026, 12:31 AM UTC

    A fix has been implemented and we are monitoring the results.

  5. monitoring May 05, 2026, 12:36 AM UTC

    We are continuing to monitor for any further issues.

  6. resolved May 05, 2026, 04:03 AM UTC

    This incident has been resolved.

Minor April 28, 2026

Backend API Errors

Detected by Pingoru
Apr 28, 2026, 10:41 PM UTC
Resolved
Apr 29, 2026, 02:07 AM UTC
Duration
3h 26m
Affected: Backend API, Frontend
Timeline · 2 updates
  1. investigating Apr 28, 2026, 10:41 PM UTC

    We are currently investigating an issue where API requests may sporadically fail.

  2. resolved Apr 29, 2026, 02:07 AM UTC

    This incident has been resolved.

Notice April 22, 2026

Elevated API Errors

Detected by Pingoru
Apr 22, 2026, 06:00 AM UTC
Resolved
Apr 28, 2026, 04:30 AM UTC
Duration
5d 22h
Affected: Backend API, Frontend
Timeline · 4 updates
  1. investigating Apr 22, 2026, 06:51 AM UTC

    We are currently investigating an issue where requests are failing.

  2. monitoring Apr 22, 2026, 07:43 AM UTC

    The issue has been identified and a fix has been rolled out. We are monitoring for any further issues.

  3. monitoring Apr 22, 2026, 03:30 PM UTC

    We are continuing to monitor for any further issues.

  4. resolved Apr 28, 2026, 04:30 AM UTC

    This incident has been resolved.

Notice April 6, 2026

API and UI outage

Detected by Pingoru
Apr 06, 2026, 03:26 PM UTC
Resolved
Apr 07, 2026, 04:22 AM UTC
Duration
12h 56m
Affected: Backend API, Frontend
Timeline · 4 updates
  1. identified Apr 06, 2026, 03:26 PM UTC

    The issue has been identified and a fix is being implemented.

  2. monitoring Apr 06, 2026, 04:51 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. monitoring Apr 06, 2026, 05:12 PM UTC

    We are seeing the site recover and become fully operational. We are continuing to actively monitor the situation to ensure stability.

  4. resolved Apr 07, 2026, 04:22 AM UTC

    This incident has been resolved.

Major April 2, 2026

Delay in files and media getting uploaded

Detected by Pingoru
Apr 02, 2026, 02:15 AM UTC
Resolved
Apr 02, 2026, 03:00 AM UTC
Duration
44m
Affected: Backend API
Timeline · 3 updates
  1. identified Apr 02, 2026, 02:15 AM UTC

    We have identified a delay in files and media getting uploaded.

  2. identified Apr 02, 2026, 02:15 AM UTC

    We are continuing to work on a fix for this issue.

  3. resolved Apr 02, 2026, 03:00 AM UTC

    This incident has been resolved.

Major March 31, 2026

W&B Inference Maintenance

Detected by Pingoru
Mar 31, 2026, 04:52 PM UTC
Resolved
Mar 31, 2026, 06:12 PM UTC
Duration
1h 19m
Affected: Inference Endpoint
Timeline · 2 updates
  1. investigating Mar 31, 2026, 04:52 PM UTC

    The W&B Inference endpoint is currently unavailable. This is due to scheduled maintenance on the gateway for which we missed the announcement step. We expect intermittent outages between 9am and 11am PDT. We apologize for any inconvenience.

  2. resolved Mar 31, 2026, 06:12 PM UTC

    Maintenance has completed.

Minor March 11, 2026

Elevated API request latencies

Detected by Pingoru
Mar 11, 2026, 11:58 PM UTC
Resolved
Mar 12, 2026, 11:33 PM UTC
Duration
23h 34m
Affected: Backend API
Timeline · 3 updates
  1. investigating Mar 11, 2026, 05:50 PM UTC

    We are aware of an issue causing elevated API request latencies, resulting in degraded performance in the UI and SDK. Recent changes in traffic patterns are resulting in increased load, which we are urgently working to address.

  2. monitoring Mar 11, 2026, 11:58 PM UTC

    We've deployed mitigations and are actively monitoring. Request latencies have decreased but are not yet back to baseline. We appreciate your patience as we work to resolve this issue.

  3. resolved Mar 12, 2026, 11:33 PM UTC

    We've been monitoring request latencies over the past 24 hours and they are back within our normal ranges.

Minor March 5, 2026

High latency - Workspaces & API Impacted

Detected by Pingoru
Mar 05, 2026, 07:03 PM UTC
Resolved
Mar 06, 2026, 12:19 AM UTC
Duration
5h 15m
Affected: Backend API, Frontend
Timeline · 5 updates
  1. investigating Mar 05, 2026, 07:03 PM UTC

    We're experiencing an elevated level of latency across the site and API. We are actively investigating the issue.

  2. investigating Mar 05, 2026, 07:07 PM UTC

    We are continuing to investigate this issue.

  3. investigating Mar 05, 2026, 08:02 PM UTC

    We have implemented a set of mitigations. Site performance is significantly improved and we are investigating further fixes.

  4. monitoring Mar 05, 2026, 09:36 PM UTC

    A fix has been implemented and we are monitoring the results.

  5. resolved Mar 06, 2026, 12:19 AM UTC

    UI & API latency has returned to normal ranges.

Notice February 9, 2026

Error loading some projects with active runs

Detected by Pingoru
Feb 09, 2026, 09:41 PM UTC
Resolved
Feb 10, 2026, 12:07 AM UTC
Duration
2h 26m
Affected: Backend API
Timeline · 2 updates
  1. identified Feb 09, 2026, 09:41 PM UTC

    A recent deployment caused an issue affecting read access to some projects with active runs. We're actively working on fixing the issue.

  2. resolved Feb 10, 2026, 12:07 AM UTC

    This incident has been resolved.

Notice January 21, 2026

Elevated latency affecting app and SDK

Detected by Pingoru
Jan 21, 2026, 04:09 PM UTC
Resolved
Jan 21, 2026, 10:25 PM UTC
Duration
6h 16m
Affected: Backend API
Timeline · 3 updates
  1. investigating Jan 21, 2026, 04:09 PM UTC

    We are currently investigating elevated latency across the application since 7:08am PST.

  2. monitoring Jan 21, 2026, 04:45 PM UTC

    Issue appears resolved as of 8:18am. We are continuing to monitor.

  3. resolved Jan 21, 2026, 10:25 PM UTC

    This incident has been resolved.

Minor January 15, 2026

Ingestion delay for some projects

Detected by Pingoru
Jan 15, 2026, 04:48 AM UTC
Resolved
Jan 15, 2026, 06:26 PM UTC
Duration
13h 38m
Affected: Backend API, Frontend
Timeline · 3 updates
  1. identified Jan 15, 2026, 04:48 AM UTC

    We are aware of an ingestion delay for some projects and are working to resolve it.

  2. identified Jan 15, 2026, 07:13 AM UTC

    We have implemented a mitigation and are making progress against the backlog.

  3. resolved Jan 15, 2026, 06:26 PM UTC

    Our systems have fully caught up with the backlog. The API should correctly return up-to-date metrics for experiments in all projects. We sincerely appreciate your patience as we worked to resolve the issue.
