CAST AI Outage History

CAST AI is up right now

CAST AI had 12 outages in the last 2 years totaling 16h 6m of downtime — averaging 0.5 incidents per month.

There were 12 CAST AI outages since October 20, 2025 totaling 16h 6m of downtime. Each is summarised below — incident details, duration, and resolution information.

Source: https://status.cast.ai

Minor June 16, 2026

Degraded: CAST AI node provisioning issues on AKS with Kubernetes 1.36

Detected by Pingoru
Jun 16, 2026, 02:47 PM UTC
Resolved
Jun 16, 2026, 02:47 PM UTC
Duration
Affected: EU Node AutoscalerNode Autoscaler
Timeline · 3 updates
  1. identified Jun 16, 2026, 11:14 AM UTC

    Status: Identified CAST AI node provisioning fails on AKS clusters running Kubernetes 1.36.0 or above. This is caused by Azure removing support for Ubuntu 22.04 (the OS currently used by CAST AI for AKS worker nodes) on Kubernetes 1.36+. Impact: Clusters upgraded to Kubernetes 1.36 on AKS will be unable to provision new nodes via CAST AI. Workaround: Do not upgrade AKS control plane to Kubernetes 1.36 until further notice. Next update: We are working on Ubuntu 24.04 support / temporary workaround and will provide an update once a fix is available. Affected components EU Node Autoscaler (Degraded performance) Node Autoscaler (Degraded performance)

  2. monitoring Jun 16, 2026, 02:08 PM UTC

    Status: Monitoring We have identified a workaround which allows nodes to continue provisioning normally for the affected clusters. Node functionality is not impacted. If you have already upgraded to 1.36 and are experiencing provisioning failures, please reach out to support and we will assist in applying the workaround. A permanent fix with Ubuntu 24.04 support is in progress. We will notify you once available. Affected components EU Node Autoscaler (Operational) Node Autoscaler (Operational)

  3. resolved Jun 16, 2026, 02:47 PM UTC

    Status: Resolved We have implemented a workaround and successfully remediated all affected clusters. Node provisioning is now functioning normally, and node functionality is not impacted. If you are running Kubernetes 1.36 and experience any node provisioning issues, please contact Support and we will be happy to assist. For any further questions, please reach out to Support or your Technical Account Manager. Affected components EU Node Autoscaler (Operational) Node Autoscaler (Operational)

Read the full incident report →

Minor June 10, 2026

Degraded Performance - api.cast.ai

Detected by Pingoru
Jun 10, 2026, 07:27 PM UTC
Resolved
Jun 10, 2026, 07:27 PM UTC
Duration
Affected: Uptime of public API: api.cast.ai
Timeline · 2 updates
  1. investigating Jun 10, 2026, 06:12 PM UTC

    Status: Investigating We are currently experiencing degraded performance with api.cast.ai, Autoscaling remains unaffected . Our engineering team is actively investigating the issue and working towards a resolution. We will provide updates as more information becomes available Affected components Uptime of public API: api.cast.ai (Degraded performance)

  2. resolved Jun 10, 2026, 07:27 PM UTC

    Status: Resolved The issue affecting the availability of api.cast.ai has been resolved. Affected components Uptime of public API: api.cast.ai (Operational)

Read the full incident report →

Minor June 5, 2026

Intermittent Azure AKS Node Provisioning Failures in Select Regions

Detected by Pingoru
Jun 05, 2026, 06:42 AM UTC
Resolved
Jun 05, 2026, 06:42 AM UTC
Duration
Affected: EU Node AutoscalerNode Autoscaler
Timeline · 4 updates
  1. identified Jun 04, 2026, 12:17 PM UTC

    Status: Identified We are investigating an issue affecting a small number of Azure AKS clusters where node provisioning can fail. Current impact: • New node provisioning ("Add Node") may fail on affected Azure AKS clusters. • Existing workloads and running nodes are not impacted. • We are working on a permanent fix and monitoring all AKS clusters for further occurrences. We will provide updates as more information becomes available. Affected components EU Node Autoscaler (Degraded performance) Node Autoscaler (Degraded performance)

  2. monitoring Jun 04, 2026, 04:47 PM UTC

    Status: Monitoring We continue to investigate intermittent Azure AKS node provisioning failures affecting a limited number of clusters in select regions. A mitigation has been applied to all currently affected clusters, and no additional customer clusters have been observed experiencing the issue at this time. We have identified the cause and are actively testing additional fixes to ensure compatibility with recent changes introduced by Azure's node bootstrap process. Our team is closely monitoring for any new occurrences and will provide further updates as progress continues. Affected components EU Node Autoscaler (Degraded performance) Node Autoscaler (Degraded performance)

  3. monitoring Jun 04, 2026, 07:40 PM UTC

    Status: Monitoring Azure AKS node provisioning is currently operating normally. We will continue monitoring closely. Affected components Node Autoscaler (Operational) EU Node Autoscaler (Operational)

  4. resolved Jun 05, 2026, 06:42 AM UTC

    Status: Resolved A fix has been deployed to restore compatibility with recent Azure AKS node bootstrap changes. We have not observed any additional affected customer clusters, and node provisioning is operating normally. If you experience any issues with node provisioning, please contact our Support team for assistance. Affected components EU Node Autoscaler (Operational) Node Autoscaler (Operational)

Read the full incident report →

Minor May 27, 2026

Documentation

Detected by Pingoru
May 27, 2026, 05:38 PM UTC
Resolved
May 27, 2026, 05:38 PM UTC
Duration
Timeline · 2 updates
  1. monitoring May 27, 2026, 05:12 PM UTC

    Status: Monitoring Cast AI documentation is inaccessible as a result of the ReadMe outage. Please follow their status page for updates: https://www.readmestatus.com/

  2. resolved May 27, 2026, 05:38 PM UTC

    Status: Resolved Cast AI documentation is now accessible. ReadMe reports their incident is resolved. https://www.readmestatus.com/

Read the full incident report →

Minor May 13, 2026

Cloud Provider Degradation

Detected by Pingoru
May 13, 2026, 07:57 PM UTC
Resolved
May 13, 2026, 07:57 PM UTC
Duration
Affected: Cloud Provider
Timeline · 1 update
  1. resolved May 13, 2026, 07:57 PM UTC

    Status: Resolved We are continuing to investigate. Further updates to follow. All CAST AI automation is back to normal. Affected components Cloud Provider (Operational)

Read the full incident report →

Minor May 12, 2026

Cloud Provider Outage

Detected by Pingoru
May 12, 2026, 09:49 PM UTC
Resolved
May 12, 2026, 09:49 PM UTC
Duration
Timeline · 2 updates
  1. investigating May 12, 2026, 09:46 PM UTC

    Status: Investigating AWS us-east-1 capacity constraints affecting new node provisioning. This is a cloud provider infrastructure issue — CAST AI services are operational. Provisioning will resume automatically once capacity is restored. Affected components Cloud Provider (Degraded performance)

  2. resolved May 12, 2026, 09:49 PM UTC

    Status: Resolved Test resolved.

Read the full incident report →

Minor April 30, 2026

Degraded Performance – CAST AI Autoscaler

Detected by Pingoru
Apr 30, 2026, 12:43 PM UTC
Resolved
Apr 30, 2026, 12:43 PM UTC
Duration
Affected: Node Autoscaler
Timeline · 2 updates
  1. investigating Apr 30, 2026, 12:03 PM UTC

    Status: Investigating We are currently experiencing degraded performance with the CAST AI Autoscaler. During this time, the autoscaler may be unable to provision new nodes, which can impact workload scheduling and scaling. Our engineering team is actively investigating the issue and working towards a resolution. We will provide updates as more information becomes available Affected components Node Autoscaler (Degraded performance)

  2. resolved Apr 30, 2026, 12:43 PM UTC

    Status: Resolved Update: The issue affecting the CAST AI Autoscaler has been resolved. The autoscaler is now operating normally, and node provisioning has been fully restored. Affected components Node Autoscaler (Operational)

Read the full incident report →

Minor April 24, 2026

Microsoft Azure VM Provisioning Issues in East US Region

Detected by Pingoru
Apr 24, 2026, 09:46 PM UTC
Resolved
Apr 24, 2026, 09:46 PM UTC
Duration
Affected: Node Autoscaler
Timeline · 2 updates
  1. monitoring Apr 24, 2026, 04:32 PM UTC

    Status: Monitoring Microsoft reports an active incident affecting VM provisioning in the Azure East US region. For updates please watch the Azure status page: https://azure.status.microsoft/en-us/status Affected components Node Autoscaler (Degraded performance)

  2. resolved Apr 24, 2026, 09:46 PM UTC

    Status: Resolved Autoscaler node provisioning for AKS clusters in East US is working as expected from a Cast AI perspective. We are declaring our impact as resolved but will continue actively monitoring the situation. Microsoft reports progress but the incident in Azure has not been declared as resolved. Please continue monitoring the Azure status page for future updates. https://azure.status.microsoft/en-us/status Affected components Node Autoscaler (Operational)

Read the full incident report →

Minor February 3, 2026

Ongoing issue with Azure AKS Clusters

Detected by Pingoru
Feb 03, 2026, 06:54 AM UTC
Resolved
Feb 03, 2026, 06:54 AM UTC
Duration
Affected: EU Node AutoscalerNode Autoscaler
Timeline · 5 updates
  1. investigating Feb 02, 2026, 08:15 PM UTC

    Status: Investigating We’re currently observing an increased rate of node add failures and long add node operations, affecting Azure AKS clusters. Our engineering team is actively investigating and will provide updates as more information becomes available. Affected components EU Node Autoscaler (Partial outage) Node Autoscaler (Partial outage)

  2. investigating Feb 02, 2026, 09:16 PM UTC

    Status: Investigating Update: Our investigation indicates this may be related to an Azure service outage impacting node creation and addition. We are awaiting further updates from Azure. As a precaution, we recommend disabling node deletion policies and scheduled rebalancing for critical clusters to avoid triggering downscaling. We’ll continue to share updates as more information becomes available. Reference: https://downdetector.com/status/windows-azure/ Affected components EU Node Autoscaler (Partial outage) Node Autoscaler (Partial outage)

  3. identified Feb 02, 2026, 09:39 PM UTC

    Status: Identified Update: Azure has reported issues with virtual machine service management across multiple regions following a recent configuration change. We’re actively monitoring the situation and will continue to provide updates as more information becomes available. Reference: https://azure.status.microsoft/en-us/status Affected components EU Node Autoscaler (Partial outage) Node Autoscaler (Partial outage)

  4. monitoring Feb 02, 2026, 11:05 PM UTC

    Status: Monitoring Update: Azure has released a fix and is rolling it out across all impacted regions. On the CAST AI side, we are observing node operations completing successfully, with nodes being added and provisioning as expected. We will provide another update once Azure confirms full resolution. Reference: https://azure.status.microsoft/en-us/status Affected components EU Node Autoscaler (Operational) Node Autoscaler (Operational)

  5. resolved Feb 03, 2026, 06:54 AM UTC

    Status: Resolved System should be fully recovered and node additional is operational. Resolution confirmed. Affected components EU Node Autoscaler (Operational) Node Autoscaler (Operational)

Read the full incident report →