Microsoft Azure incident
Active – Recovering - Multiple Services experiencing degradation in West US 2
Microsoft Azure experienced a major incident on May 29, 2026 affecting three components in West US 2: Azure Kubernetes Service (AKS), Virtual Machine Scale Sets, and one more component. The incident lasted 17h 18m and has been resolved; the full update timeline is below.
Update timeline
- monitoring May 29, 2026, 04:27 AM UTC
Impact statement: Starting at 04:27 UTC on 29 May 2026, a severe thunderstorm caused widespread utility power loss across multiple West US 2 datacenter facilities, resulting in impact to multiple Azure services. Datacenter power has been fully restored, and the majority of network and storage infrastructure has recovered. Infrastructure in Availability Zone 2 has recovered and is operating normally. Residual impact is now concentrated on a subset of infrastructure within Availability Zones 1 and 3, where two storage stamps are undergoing final recovery and validation. Customers with resources deployed exclusively in Availability Zone 2 should not be affected by the remaining recovery activities; however, those with dependencies in Availability Zones 1 or 3 may continue to experience intermittent connectivity, elevated latency, or resource unavailability.
Currently affected Azure services include, but may not be limited to (recovery validation underway): Azure Functions, Azure Database for MySQL Flexible Server, Azure Database for PostgreSQL Flexible Server, Azure SQL, Azure Managed Grafana, Virtual Machines, Virtual Machine Scale Sets, Azure Kubernetes Service, Azure Storage, Application Insights, Azure Monitor, and Azure Log Analytics.
The following services have recovered and are operating normally: Service Bus, App Service (Web Apps), Azure Site Recovery, Backup (MAB), Azure Cosmos DB, Azure Resource Manager, Azure Data Explorer, Azure IoT Hub, Microsoft Defender for Cloud Apps, Azure Container Registry, Azure Policy, Azure NetApp Files, Azure Resource Graph, Azure Data Factory, Azure Databricks, Azure Cache for Redis, and Azure Synapse Analytics.
Current Status: A severe weather event, including multiple lightning strikes, affected several datacenter facilities in the West US 2 region, resulting in concurrent utility power interruptions across multiple availability zones. Backup generators activated as designed; however, during the transition to sustained generator operation, a subset of generator systems were unable to fully synchronize under the sudden facility load, while others subsequently shut down due to thermal protection mechanisms as cooling systems were impacted by the broader power disruption. Because these events occurred across multiple facilities within the same timeframe, the resulting conditions exceeded the resiliency designed for this particular failure scenario.
Datacenter utility power has been fully restored. HVAC systems have normalized, ambient temperatures have returned to expected operating ranges, and the majority of network and storage infrastructure has recovered successfully. While most services have returned to normal operation, a small subset of storage clusters and network devices did not recover cleanly and continue to limit full-service restoration. Some storage clusters remain partially unavailable or dependent on ongoing network recovery activities, and a limited number of network devices connecting compute resources to storage infrastructure remain impaired. As a result, some virtual machines are currently unable to access their underlying storage dependencies and may remain unavailable.
Recovery efforts have transitioned from broad infrastructure restoration to targeted remediation of specific network devices, storage clusters, and isolated compute nodes. As these components are restored, validated, and returned to service, dependent workloads are recovering incrementally. Two storage stamps are currently completing final recovery and integrity validation activities, and services dependent on those stamps are expected to recover as this work is completed.
Estimated Time to Resolution: With datacenter power restored and the majority of network and storage infrastructure recovered, remaining work is focused on two storage stamps completing final recovery and data integrity validation activities. Based on current progress, we estimate full-service restoration within approximately 2-3 hours. Individual services dependent on these stamps will recover progressively as validation activities are completed.
Customer Guidance:
Single-region workloads will recover automatically as infrastructure restoration activities are completed; no customer action is required.
We recommend postponing new deployments into West US 2 until this incident is resolved and using alternate regions for urgent deployment needs.
Logical availability zones assigned to customer subscriptions may map to different physical availability zones. Customers can use the Locations API to understand this mapping: https://learn.microsoft.com/rest/api/resources/subscriptions/list-locations?HTTP#availabilityzonemappings.
Next Update: We will provide our next status update within 60 minutes, or sooner if significant progress is made.
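For customers following the zone-mapping guidance above, the sketch below shows one way to read a Locations API response and work out which of a subscription's logical zones land on the impaired physical zones. It is an illustration under stated assumptions, not official tooling: the `api-version` value, the sample payload, and the physical zone names (`westus2-az1` and so on) are hypothetical, and it assumes that "Availability Zones 1 and 3" in this notice refer to physical zones. The field names `availabilityZoneMappings`, `logicalZone`, and `physicalZone` follow the linked API reference.

```python
import json

# Assumed api-version; confirm the current value in the Locations API reference.
API_VERSION = "2022-12-01"


def locations_url(subscription_id):
    """Build the ARM request URL that lists locations for a subscription."""
    return (
        "https://management.azure.com/subscriptions/"
        f"{subscription_id}/locations?api-version={API_VERSION}"
    )


def zone_mapping(locations, region):
    """Return {logical zone: physical zone} for one region of the response."""
    for loc in locations.get("value", []):
        if loc.get("name") == region:
            return {
                m["logicalZone"]: m["physicalZone"]
                for m in loc.get("availabilityZoneMappings", [])
            }
    return {}


def exposed_logical_zones(mapping, impaired_physical):
    """List the logical zones whose physical zone is in the impaired set."""
    return sorted(
        logical for logical, physical in mapping.items()
        if physical in impaired_physical
    )


# Hypothetical response fragment; a real mapping differs per subscription.
SAMPLE = json.loads("""
{
  "value": [
    {
      "name": "westus2",
      "availabilityZoneMappings": [
        {"logicalZone": "1", "physicalZone": "westus2-az2"},
        {"logicalZone": "2", "physicalZone": "westus2-az1"},
        {"logicalZone": "3", "physicalZone": "westus2-az3"}
      ]
    }
  ]
}
""")

mapping = zone_mapping(SAMPLE, "westus2")
# Assumption: the impaired zones named in this notice are physical zones 1 and 3.
impaired = {"westus2-az1", "westus2-az3"}
print(exposed_logical_zones(mapping, impaired))
```

In this made-up mapping, logical zone 1 sits on physical zone 2, so resources pinned to logical zone 1 would fall outside the remaining recovery work, while logical zones 2 and 3 would not. To use real data, send an authenticated GET to the address returned by `locations_url` and pass the parsed JSON to `zone_mapping`.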