Cerebras Outage History

Cerebras had 24 outages in the last 2 years totaling 59h 3m of downtime — averaging 1 incident per month.

Each of the 24 Cerebras outages recorded since June 8, 2025 is summarised below, with incident details, duration, and resolution information.

Source: https://status.cerebras.ai
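The summary figures above can be cross-checked against the per-incident durations listed on this page. The following is a minimal sketch, not Pingoru's actual calculation: the duration list is transcribed by hand from the entries below, incidents shown without a duration are left out, and reading "the last 2 years" as 24 months is an assumption.

```python
# Durations shown on this page as (hours, minutes), newest first.
# Incidents listed without a duration are omitted.
listed = [
    (0, 57), (6, 10), (0, 22), (8, 36), (2, 55), (1, 22), (9, 7),
    (3, 46), (2, 56), (0, 14), (2, 30), (3, 5), (4, 54), (12, 1),
]

listed_total = sum(h * 60 + m for h, m in listed)  # minutes
quoted_total = 59 * 60 + 3                         # "59h 3m" from the summary
quoted_incidents = 24
window_months = 24  # assumption: "the last 2 years" read as 24 months


def fmt_minutes(total: int) -> str:
    """Format whole minutes in the page's 'Xh Ym' style."""
    return f"{total // 60}h {total % 60}m"


print(len(listed))                                # 14
print(fmt_minutes(listed_total))                  # 58h 55m
print(quoted_total - listed_total)                # 8
print(quoted_incidents / window_months)           # 1.0
print(fmt_minutes(listed_total // len(listed)))   # 4h 12m
```

The listed durations fall 8 minutes short of the quoted total, which would be consistent with each duration being truncated to whole minutes, though the page does not say how the total is derived.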

Critical May 1, 2026

qwen-3-235b-a22b-instruct-2507 Service Unavailable

Detected by Pingoru: May 01, 2026, 07:59 AM UTC
Resolved: May 01, 2026, 08:57 AM UTC
Duration: 57m

Affected: Qwen-3-235B-Instruct-2507

Timeline · 2 updates

investigating May 01, 2026, 07:59 AM UTC

The service is currently inaccessible. We are currently working urgently to restore service capabilities. We will provide further updates as we make progress.

resolved May 01, 2026, 08:57 AM UTC

This incident has been resolved.

Notice April 3, 2026

Service disruption with qwen-3-235b-a22b-instruct-2507

Detected by Pingoru: Apr 03, 2026, 10:42 AM UTC
Resolved: Apr 03, 2026, 09:00 AM UTC
Duration: —

Timeline · 1 update

resolved Apr 03, 2026, 10:42 AM UTC

Between 9:07 AM UTC and 9:27 AM UTC users experienced service disruption with qwen-3-235b-a22b-instruct-2507. We have deployed a fix and the issue is now resolved.
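Entries such as this one carry a "Resolved" time that precedes the "Detected" time, and no duration is shown for them. The sketch below illustrates one way to compute a duration from the timestamp format used on this page and to return nothing when the interval is not positive. It is an illustration only: `duration_minutes` is a hypothetical helper, and treating non-positive intervals as "no duration" is an assumption about how such entries might be handled, not a documented Pingoru rule.

```python
from datetime import datetime
from typing import Optional

# Timestamp layout used on this page, e.g. "May 01, 2026, 07:59 AM UTC".
STAMP = "%b %d, %Y, %I:%M %p UTC"


def duration_minutes(detected: str, resolved: str) -> Optional[int]:
    """Whole minutes between detection and resolution, or None if not positive."""
    start = datetime.strptime(detected, STAMP)
    end = datetime.strptime(resolved, STAMP)
    seconds = (end - start).total_seconds()
    if seconds <= 0:
        return None
    return int(seconds // 60)


# Retrospective notice: resolved time is earlier than detected time.
print(duration_minutes("Apr 03, 2026, 10:42 AM UTC", "Apr 03, 2026, 09:00 AM UTC"))  # None

# Ordinary incident (March 17, 2026): 6h 10m.
print(duration_minutes("Mar 17, 2026, 12:21 PM UTC", "Mar 17, 2026, 06:31 PM UTC"))  # 370
```

Because the page shows timestamps only to the minute, a result computed this way can differ by a minute from the listed duration.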

Notice April 3, 2026

Partial service disruption with GPT-OSS-120B

Detected by Pingoru: Apr 03, 2026, 10:38 AM UTC
Resolved: Apr 03, 2026, 09:00 AM UTC
Duration: —

Timeline · 1 update

resolved Apr 03, 2026, 10:38 AM UTC

Between 9:07 AM UTC and 9:18 AM UTC users experienced partial service disruption with gpt-oss-120b. We have deployed a fix and the issue is now resolved.

Notice March 18, 2026

GLM 4.7 Partial Service Disruption

Detected by Pingoru: Mar 18, 2026, 09:40 AM UTC
Resolved: Mar 18, 2026, 05:30 AM UTC
Duration: —

Timeline · 1 update

resolved Mar 18, 2026, 09:40 AM UTC

Between 10:30 PM PST and 11:42 PM PST 03/17 users experienced partial degradation with glm-4.7. We have deployed a fix and the issue is now resolved.
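Several updates quote Pacific times while the page's own fields are in UTC. The sketch below converts the start of the window quoted above to UTC under two fixed offsets. It is a rough illustration: the fixed-offset zones and the `to_utc` helper are assumptions introduced here, and the year is taken from the incident date.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets used for illustration only (no daylight-saving rules applied).
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")


def to_utc(local: str, tz: timezone) -> datetime:
    """Interpret a 'YYYY-MM-DD HH:MM AM/PM' string in tz and convert it to UTC."""
    naive = datetime.strptime(local, "%Y-%m-%d %I:%M %p")
    return naive.replace(tzinfo=tz).astimezone(timezone.utc)


start_local = "2026-03-17 10:30 PM"
print(to_utc(start_local, PST))  # 2026-03-18 06:30:00+00:00
print(to_utc(start_local, PDT))  # 2026-03-18 05:30:00+00:00
```

The UTC-7 result coincides with the 05:30 AM UTC shown in this entry's "Resolved" field, which suggests the quoted times may be in daylight time despite the "PST" label; the source does not confirm this.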

Major March 17, 2026

Z.ai GLM 4.7 Service Unavailable

Detected by Pingoru: Mar 17, 2026, 12:21 PM UTC
Resolved: Mar 17, 2026, 06:31 PM UTC
Duration: 6h 10m

Affected: ZAI-GLM-4.7

Timeline · 2 updates

identified Mar 17, 2026, 12:21 PM UTC

The service is currently inaccessible. We are currently working urgently to restore service capabilities. We will provide further updates as we make progress.

resolved Mar 17, 2026, 06:31 PM UTC

This incident has been resolved.

Minor March 6, 2026

Partial Service Disruption

Detected by Pingoru: Mar 06, 2026, 05:29 AM UTC
Resolved: Mar 06, 2026, 05:51 AM UTC
Duration: 22m

Affected: Qwen-3-235B-Instruct-2507

Timeline · 3 updates

investigating Mar 06, 2026, 05:29 AM UTC

Qwen 235B is facing partial service disruption. We are currently working to resume normal service performance. We will provide further updates as we make progress.

monitoring Mar 06, 2026, 05:47 AM UTC

We identified the issue, applied a fix and are monitoring the endpoint.

resolved Mar 06, 2026, 05:51 AM UTC

Between 4:53 UTC and 5:38 UTC, the Qwen 3 235B endpoint experienced a partial service disruption due to a transient network issue. The issue has been identified and fixed, and the endpoint is operational.

Minor February 10, 2026

Partial Service Disruption on zai-glm-4.7

Detected by Pingoru: Feb 10, 2026, 09:37 PM UTC
Resolved: Feb 11, 2026, 06:14 AM UTC
Duration: 8h 36m

Affected: ZAI-GLM-4.7

Timeline · 3 updates

investigating Feb 10, 2026, 09:37 PM UTC

We are currently investigating an issue with zai-glm-4.7 where users are seeing an elevated number of 503 errors. Engineering is working on a resolution.

identified Feb 11, 2026, 12:12 AM UTC

We've identified a fix and our engineering team is deploying this now.

resolved Feb 11, 2026, 06:14 AM UTC

This incident has been resolved.
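Incidents like this one surface to API clients as elevated 503 responses (and, in the February 5 entry, 502 responses). A common client-side mitigation is to retry with exponential backoff and jitter. The sketch below is a generic illustration, not Cerebras guidance: `call_with_backoff`, the `send` callable, and the retry parameters are all hypothetical, and `send` stands in for whatever HTTP client call an application already makes.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE = {502, 503}  # gateway and availability errors named in these incidents


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff plus random jitter to avoid synchronized retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    return status, body


# Usage with a stand-in sender that fails twice before succeeding.
responses = iter([(503, "unavailable"), (503, "unavailable"), (200, "ok")])
result = call_with_backoff(lambda: next(responses), sleep=lambda s: None)
print(result)  # (200, 'ok')
```

Passing `sleep` as a parameter keeps the helper testable without real delays. During a multi-hour outage, retries alone will not help, so callers would typically cap attempts and fall back or fail fast.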

Minor February 10, 2026

Partial Service Disruption on Llama-3.3-70B

Detected by Pingoru: Feb 10, 2026, 07:52 PM UTC
Resolved: Feb 10, 2026, 10:47 PM UTC
Duration: 2h 55m

Timeline · 4 updates

investigating Feb 10, 2026, 07:52 PM UTC

We are currently investigating an issue with Llama-3.3-70B where users are seeing an elevated number of 503 errors. Engineering is working on a resolution.

investigating Feb 10, 2026, 09:09 PM UTC

We've identified a fix and our engineering team is deploying this now.

investigating Feb 10, 2026, 10:47 PM UTC

Our engineering team has resolved the issue.

resolved Feb 10, 2026, 10:47 PM UTC

This incident has been resolved.

Minor February 5, 2026

Partial Service Disruption on Cerebras Endpoints

Detected by Pingoru: Feb 05, 2026, 06:42 AM UTC
Resolved: Feb 05, 2026, 08:04 AM UTC
Duration: 1h 22m

Affected: Llama3.1-8B, Qwen-3-235B-Instruct-2507, GPT-OSS-120B, ZAI-GLM-4.7, Developer Console

Timeline · 2 updates

investigating Feb 05, 2026, 06:42 AM UTC

We are investigating 502 Gateway errors on Cerebras endpoints. We are currently working to resume normal service performance. We will provide further updates as we make progress.

resolved Feb 05, 2026, 08:04 AM UTC

Between 7:30 PM PT and 11 PM PT, the inference service was partially disrupted by 502 Gateway errors across all endpoints. The issue was caused by an internal system dependency; a fix has been rolled out and the service is operational across all endpoints.

Major February 4, 2026

Partial Service Disruption Qwen 235B

Detected by Pingoru: Feb 04, 2026, 04:59 PM UTC
Resolved: Feb 05, 2026, 02:07 AM UTC
Duration: 9h 7m

Affected: Qwen-3-235B-Instruct-2507

Timeline · 4 updates

investigating Feb 04, 2026, 04:59 PM UTC

Qwen 235B performance is degraded and temporarily unavailable for some service tiers. We are currently working to resume normal service performance. We will provide further updates as we make progress.

monitoring Feb 04, 2026, 06:53 PM UTC

As part of recent changes in traffic patterns and capacity, we are temporarily turning down rate limits for Pay-go users to help maintain a positive experience across the board. We understand the challenges this may create for you and your users, and we sincerely apologize for the inconvenience. We plan to monitor this endpoint and evaluate restoring rate limits in the next week. Thank you for your continued partnership and understanding. Please reach out to our support team with any other questions.

investigating Feb 05, 2026, 02:07 AM UTC

Between 06:00 AM PT and 06:00 PM PT users experienced service disruption with Qwen 3 235B Instruct. We have taken action to address recent changes in traffic patterns and capacity, reducing the disruption. We plan to monitor this endpoint and evaluate restoring rate limits in the next week for Pay Go users.

resolved Feb 05, 2026, 02:07 AM UTC

Between 06:00 AM PT and 06:00 PM PT users experienced service disruption with Qwen 3 235B Instruct. We have taken action to address recent changes in traffic patterns and capacity, reducing the disruption. We plan to monitor this endpoint and evaluate restoring rate limits in the next week for Pay Go users.

Notice January 27, 2026

Qwen-32B Service Experiencing Degraded Performance

Detected by Pingoru: Jan 27, 2026, 05:18 PM UTC
Resolved: Jan 27, 2026, 09:05 PM UTC
Duration: 3h 46m

Timeline · 3 updates

investigating Jan 27, 2026, 05:18 PM UTC

Qwen 32b is currently inaccessible. We are currently working urgently to restore service capabilities. We will provide further updates as we make progress.

identified Jan 27, 2026, 05:46 PM UTC

Part of the service has been restored, and we are currently working to resume normal service performance.

resolved Jan 27, 2026, 09:05 PM UTC

This incident has been resolved.

Major January 18, 2026

Llama 3.3 70B Service Unavailable

Detected by Pingoru: Jan 18, 2026, 09:40 PM UTC
Resolved: Jan 19, 2026, 12:36 AM UTC
Duration: 2h 56m

Timeline · 5 updates

investigating Jan 18, 2026, 09:40 PM UTC

The service is currently inaccessible. We are currently working urgently to restore service capabilities. We will provide further updates as we make progress.

identified Jan 18, 2026, 11:52 PM UTC

The root cause has been identified and a fix is being implemented to bring the service back up.

monitoring Jan 19, 2026, 12:03 AM UTC

The fix has been rolled out; the service has resumed taking traffic and is being monitored.

investigating Jan 19, 2026, 12:35 AM UTC

Between 12:45 PM PT and 4:00 PM PT users experienced service unavailability with Llama 3.3 70B, caused by a datacenter issue. We have deployed a fix and the issue is now resolved; the model endpoint is operational.

resolved Jan 19, 2026, 12:36 AM UTC

Between 12:45 PM PT and 4:00 PM PT users experienced service unavailability with Llama 3.3 70B, caused by a datacenter issue. We have deployed a fix and the issue is now resolved; the model endpoint is operational.

Minor January 17, 2026

Partial Service Disruption - API Key Error

Detected by Pingoru: Jan 17, 2026, 01:47 AM UTC
Resolved: Jan 17, 2026, 02:01 AM UTC
Duration: 14m

Affected: Llama3.1-8B, Qwen-3-235B-Instruct-2507, GPT-OSS-120B, ZAI-GLM-4.7

Timeline · 3 updates

investigating Jan 17, 2026, 01:47 AM UTC

Partial service disruption due to a 401 Unauthorized API key error. We are currently working to resume normal service performance. We will provide further updates as we make progress.

investigating Jan 17, 2026, 01:53 AM UTC

This issue is caused by an internal system dependency, and we are currently working to restore system performance.

resolved Jan 17, 2026, 02:01 AM UTC

Between 4:25 PM PT and 5:45 PM PT, developers experienced minor service disruption due to an API key error. The issue has been identified and resolved, and normal service operation is restored.

Minor December 12, 2025

Partial Degradation of Qwen3 32B, Llama 3.1 8B, Llama 3.3 70B, and Qwen-3-235B-Instruct-2507

Detected by Pingoru: Dec 12, 2025, 03:33 PM UTC
Resolved: Dec 12, 2025, 06:04 PM UTC
Duration: 2h 30m

Affected: Llama3.1-8B, Qwen-3-235B-Instruct-2507

Timeline · 7 updates

investigating Dec 12, 2025, 03:33 PM UTC

We are currently working to resume normal service performance. We will provide further updates as we make progress.

investigating Dec 12, 2025, 03:36 PM UTC

We are continuing to investigate this issue.

investigating Dec 12, 2025, 03:37 PM UTC

We are continuing to investigate this issue.

monitoring Dec 12, 2025, 03:58 PM UTC

We've deployed the fix and affected service is now recovering. We are actively monitoring service performance.

investigating Dec 12, 2025, 05:47 PM UTC

Between 06:50 AM PST and 07:45 AM PST users experienced partial degradation with llama3.1-8b, llama-3.3-70b, qwen-3-32b, qwen-3-235b-instruct-2507 models. We have deployed a fix and the issue is now resolved.

investigating Dec 12, 2025, 05:48 PM UTC

Between 06:50 AM PST and 07:45 AM PST users experienced partial degradation with llama3.1-8b, llama-3.3-70b, qwen-3-32b, qwen-3-235b-instruct-2507 models. We have deployed a fix and the issue is now resolved.

resolved Dec 12, 2025, 06:04 PM UTC

Between 06:50 AM PST and 07:45 AM PST users experienced partial degradation with llama3.1-8b, llama-3.3-70b, qwen-3-32b, qwen-3-235b-instruct-2507 models. We have deployed a fix and the issue is now resolved.

Minor December 12, 2025

Partial Degradation of Qwen3 235B Instruct

Detected by Pingoru: Dec 12, 2025, 04:20 AM UTC
Resolved: Dec 11, 2025, 09:30 AM UTC
Duration: —

Timeline · 1 update

resolved Dec 12, 2025, 04:20 AM UTC

Between 01:30 AM PT and 2:30 PM PT users experienced partial degradation with Qwen 3 235B Instruct. We have deployed a fix and the issue is now resolved.

Critical November 18, 2025

Identified - Caused by External Dependency

Detected by Pingoru: Nov 18, 2025, 12:30 PM UTC
Resolved: Nov 18, 2025, 03:36 PM UTC
Duration: 3h 5m

Affected: Llama3.1-8B, Qwen-3-235B-Instruct-2507, GPT-OSS-120B, Developer Console

Timeline · 7 updates

investigating Nov 18, 2025, 12:30 PM UTC

Around 4:09 AM PT, Inference Platform went down causing 500 Internal Server Errors. This issue is caused by a third-party system, and we are currently monitoring.

investigating Nov 18, 2025, 12:54 PM UTC

We are continuing to investigate this issue.

investigating Nov 18, 2025, 02:24 PM UTC

We are continuing to investigate this issue.

identified Nov 18, 2025, 02:41 PM UTC

The issue has been identified and a fix is being implemented.

monitoring Nov 18, 2025, 02:55 PM UTC

A fix has been implemented and we are monitoring the results.

monitoring Nov 18, 2025, 02:57 PM UTC

The platform is accessible now. We are monitoring to ensure stability.

resolved Nov 18, 2025, 03:36 PM UTC

This incident has been resolved.

Notice November 8, 2025

Resolved

Detected by Pingoru: Nov 08, 2025, 04:09 PM UTC
Resolved: Nov 08, 2025, 09:30 AM UTC
Duration: —

Timeline · 1 update

resolved Nov 08, 2025, 04:09 PM UTC

Between 1:30 AM PST and 2:30 AM PST users experienced service disruption with ZAI-GLM-4.6. We have deployed a fix and the issue is now resolved.

Notice October 22, 2025

Service Degradation

Detected by Pingoru: Oct 22, 2025, 04:18 PM UTC
Resolved: Oct 22, 2025, 09:13 PM UTC
Duration: 4h 54m

Affected: Qwen-3-235B-Instruct-2507, GPT-OSS-120B

Timeline · 5 updates

investigating Oct 22, 2025, 04:18 PM UTC

We are experiencing degradation of service across several of our models. Our engineering team is working to identify the root cause and restore stability to our systems.

identified Oct 22, 2025, 05:01 PM UTC

We've identified that the combination of high load and a network routing issue resulted in higher queue times and rejection rates. ETA on resolution is a few hours.

identified Oct 22, 2025, 06:01 PM UTC

Our teams are continuing to work to resolve this issue. We appreciate your patience.

monitoring Oct 22, 2025, 07:44 PM UTC

A fix has been implemented, availability is returning to 100%, and latency times are improving. We will continue to monitor over the next few hours.

resolved Oct 22, 2025, 09:13 PM UTC

This incident has been resolved.

Minor October 19, 2025

Datacenter Issue Causing Minimal Degradation

Detected by Pingoru: Oct 19, 2025, 02:37 AM UTC
Resolved: Oct 19, 2025, 02:38 PM UTC
Duration: 12h 1m

Affected: GPT-OSS-120B

Timeline · 4 updates

investigating Oct 19, 2025, 02:37 AM UTC

We are experiencing issues with one of our datacenters. We are actively working on mitigation. Seeing minimal degradation of service at this point. Fortunately, this only impacts one datacenter, the rest are operational, and redundancies are in place. ETA recovery time for datacenter: 8 hours. We will actively keep you updated on the progress.

investigating Oct 19, 2025, 02:37 AM UTC

We are continuing to investigate this issue.

identified Oct 19, 2025, 02:41 AM UTC

Update: The root cause has been traced to a city water supply disruption. We are actively working on mitigation.

resolved Oct 19, 2025, 02:38 PM UTC

Between Oct 18th 7 PM PT and Oct 19th 6:30 AM PT users experienced minor service disruption with Llama 3.3 70B, Qwen 3 32B, GPT OSS 120B, Qwen 3 235B Thinking, and Qwen 3 Coder 480B because of a datacenter issue. We have deployed the fix and the issue is now resolved.

Major August 22, 2025

Reduced availability for Llama-3.3-70B

Detected by Pingoru: Aug 22, 2025, 04:47 PM UTC
Resolved: Aug 22, 2025, 04:46 PM UTC
Duration: —

Timeline · 1 update

resolved Aug 22, 2025, 04:47 PM UTC

Llama-3.3-70B endpoint experienced reduced availability starting at approximately 7:45am PT. Endpoint went back to full availability starting 8:55am PT.

Minor July 31, 2025

Reduced Availability - Resolved

Detected by Pingoru: Jul 31, 2025, 06:00 PM UTC
Resolved: Jul 31, 2025, 06:00 PM UTC
Duration: —

Timeline · 1 update

resolved Jul 31, 2025, 09:42 PM UTC

We observed a temporary increase in latency and intermittent 503 errors for the Llama-3.3-70B model earlier today (between 10:59am Pacific to 11:09am Pacific on 7/31/2025). This was due to ongoing infrastructure operations, including system scaling and maintenance, which briefly reduced available capacity. Service availability and performance are now stabilized.

Notice July 25, 2025

Service Unavailable - Resolved

Detected by Pingoru: Jul 25, 2025, 10:23 PM UTC
Resolved: Jul 25, 2025, 08:00 PM UTC
Duration: —

Timeline · 1 update

resolved Jul 25, 2025, 10:23 PM UTC

Llama-3.3-70B, DeepSeek-R1-Distill-Llama-70B, Qwen-3-32B (7/25/2025 from 1:17 PM to 1:41 PM PT) were unavailable. We restored normal service performance at 1:41 PM PT on July 25th.

Major June 25, 2025

Qwen-3-32B API Incident – Resolved

Detected by Pingoru: Jun 25, 2025, 02:30 PM UTC
Resolved: Jun 25, 2025, 02:30 PM UTC
Duration: —

Timeline · 1 update

resolved Jun 25, 2025, 06:12 PM UTC

We had a brief issue this morning (6/25/2025 from 7:25 AM to 9:40 AM PT) affecting the Qwen-3-32B shared endpoint. Some users may have seen 404 or 503 errors during that time.

Major June 19, 2025

Service Unavailable

Detected by Pingoru: Jun 19, 2025, 06:30 PM UTC
Resolved: Jun 19, 2025, 06:30 PM UTC
Duration: —

Timeline · 1 update

resolved Jun 20, 2025, 01:53 PM UTC

Llama3.1-8B was unavailable. We restored normal service performance at 3:21 PM EST on June 19th.
