Astronomer Outage History

Astronomer had 60 outages in the last 2 years totaling 67h 38m of downtime — averaging 2.5 incidents per month.

There have been 60 Astronomer outages since April 18, 2025, totaling 67h 38m of downtime. Each is summarised below with incident details, duration, and resolution information.
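
The headline figures can be checked with a few lines of arithmetic. This is a minimal sketch: the incident count and total downtime are taken from the summary above, while the 24-month window is an assumption based on the phrase "last 2 years".

```python
from datetime import timedelta

# Figures quoted in the summary above.
INCIDENTS = 60
TOTAL_DOWNTIME = timedelta(hours=67, minutes=38)

# Assumption: "the last 2 years" is treated as exactly 24 months.
MONTHS = 24

incidents_per_month = INCIDENTS / MONTHS
mean_downtime = TOTAL_DOWNTIME / INCIDENTS

print(incidents_per_month)  # 2.5
print(mean_downtime)        # 1:07:38
```

Under that assumption the average works out to 2.5 incidents per month, with a mean downtime of roughly 1h 8m per incident.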

Source: https://status.astronomer.io

Minor June 16, 2026

Azure clusters may experience issues with worker scale up and task pod scheduling

Detected by Pingoru: Jun 16, 2026, 12:05 AM UTC
Resolved: Jun 16, 2026, 12:53 AM UTC
Duration: 47m
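
As an illustration of how a duration relates to the two timestamps above, the sketch below parses the displayed "Detected" and "Resolved" values and subtracts them. The helper name `parse_status_time` is hypothetical, and the format string is an assumption based on how timestamps are printed on this page.

```python
from datetime import datetime, timedelta

# Assumed format of the timestamps as printed on this page.
TIMESTAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"

def parse_status_time(text: str) -> datetime:
    """Parse a timestamp such as 'Jun 16, 2026, 12:05 AM UTC'."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)

detected = parse_status_time("Jun 16, 2026, 12:05 AM UTC")
resolved = parse_status_time("Jun 16, 2026, 12:53 AM UTC")

duration = resolved - detected
print(duration)  # 0:48:00
```

The displayed timestamps are rounded to the minute, so the result (48 minutes) can differ by a minute from the listed duration (47m). One plausible explanation, not confirmed by the source, is that the listed value is computed from timestamps with seconds.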

Timeline · 4 updates

identified Jun 16, 2026, 12:05 AM UTC

**The issue, according to the Azure status page:** Investigating a spike in 401 authentication errors. We are actively investigating customer alerts of 401 authentication errors when pulling images from mcr.microsoft.com. We've identified a potential underlying factor related to a recent update made to a backend service and we are actively working on determining mitigation workstreams. More information will be provided shortly. This message was last updated at 23:04 UTC on 15 June 2026.
monitoring Jun 16, 2026, 12:05 AM UTC

A fix has been implemented and we are monitoring the results.
investigating Jun 16, 2026, 12:05 AM UTC

We are currently investigating this issue.
resolved Jun 16, 2026, 01:07 AM UTC

This incident has been resolved.


Notice June 15, 2026

astro deploys not reflecting on certain Astro clusters

Detected by Pingoru: Jun 15, 2026, 04:23 PM UTC
Resolved: Jun 15, 2026, 05:51 PM UTC
Duration: 1h 27m

Timeline · 3 updates

identified Jun 15, 2026, 04:23 PM UTC

We've identified an issue with `astro deploy` runs not updating manifests in Astro Hosted clusters. We're applying a workaround and continuing to monitor the situation.
monitoring Jun 15, 2026, 04:54 PM UTC

We are monitoring the effect of the applied fix and are observing error rates coming down or ceasing for affected clusters.
resolved Jun 15, 2026, 05:51 PM UTC

Mitigation was effective and all affected clusters are healthy.


Minor May 8, 2026

Increased Task Failures in US-East-1 (AWS) - AWS Incident

Detected by Pingoru: May 08, 2026, 08:06 AM UTC
Resolved: May 09, 2026, 01:43 AM UTC
Duration: 17h 37m

Timeline · 3 updates

monitoring May 08, 2026, 08:06 AM UTC

We are continuing to monitor the situation as AWS works toward full recovery in the affected Availability Zone. At this time, we are seeing signs of stabilization, and most transient failures should continue to self-resolve. We will provide further updates once AWS confirms that the issue has been fully resolved.
investigating May 08, 2026, 08:06 AM UTC

We are currently observing elevated task failures and latency for some deployments running in the AWS us-east-1 region. This is related to an ongoing AWS incident affecting a single Availability Zone (use1-az4), where EC2 and EBS resources have experienced impairments. What to expect: You may see intermittent task failures or retries in your deployments. In most cases, these failures are transient and should self-resolve automatically as AWS continues recovery. What you should do: No immediate action is required. However, if you notice consistent or prolonged failures, please reach out to Astronomer Support, and we’ll help investigate further. We are continuing to monitor the situation closely and will share updates as needed. Refer: https://health.aws.amazon.com/health/status
resolved May 09, 2026, 02:07 AM UTC

AWS has confirmed recovery in the affected Availability Zone (use1-az4), and system performance has returned to normal levels. We are no longer observing elevated task failures or latency in the AWS us-east-1 region. Any previously impacted workloads should now be operating as expected. If you continue to experience issues, please reach out to Astronomer Support for assistance.


Minor May 7, 2026

Delayed delivery for some time-based alerts

Detected by Pingoru: May 07, 2026, 11:08 AM UTC
Resolved: May 07, 2026, 02:24 PM UTC
Duration: 3h 16m

Timeline · 5 updates

investigating May 07, 2026, 11:08 AM UTC

We are investigating delayed delivery for some time-based alerts. The delay is primarily visible for DAG Timeliness alerts, and may also affect Observe-related SLA, Proactive SLA, and Data Quality monitor alerts. DAG Duration and Task Duration alerts do not appear to be affected at this time.
identified May 07, 2026, 12:08 PM UTC

We have identified the cause of the delayed delivery for some time-based alerts and are implementing a fix. We will provide another update once the fix has been deployed and validated.
identified May 07, 2026, 01:06 PM UTC

A fix has been prepared and is moving through deployment. We will provide another update once the rollout is complete and validation is underway.
monitoring May 07, 2026, 02:07 PM UTC

A fix has been implemented, and we are monitoring the results.
resolved May 07, 2026, 03:08 PM UTC

The incident has been resolved.


Major May 6, 2026

Deployments on Astronomer Runtime 3.2-3 are returning 403 errors when accessing dags page.

Detected by Pingoru: May 06, 2026, 08:05 PM UTC
Resolved: May 06, 2026, 08:05 PM UTC
Duration: —

Timeline · 2 updates

investigating May 06, 2026, 08:05 PM UTC

Attempting to access the dags page in the Airflow UI results in a 403 Forbidden error. This should not be affecting task execution.
resolved May 06, 2026, 09:05 PM UTC

Our team has determined that this only affects internal access; Astro end users are unaffected.


Major May 4, 2026

Dashboard cost breakdown data delays

Detected by Pingoru: May 04, 2026, 11:09 AM UTC
Resolved: May 04, 2026, 08:15 PM UTC
Duration: 9h 6m

Timeline · 3 updates

investigating May 04, 2026, 11:09 AM UTC

We’re investigating an issue affecting Dashboard cost breakdown data. For affected customers, cost breakdown information may appear stale and may not have updated since May 1, 2026. Our team is actively investigating the cause and working to restore current data. We’ll share another update as we have more information.
investigating May 04, 2026, 11:09 AM UTC

We are continuing to investigate this issue.
resolved May 04, 2026, 09:08 PM UTC

This incident is now resolved, and all Cost Breakdown data is now up to date. The issue that caused the delay has been fixed and should not recur.


Major April 27, 2026

Airflow UI showing 403s for some customers

Detected by Pingoru: Apr 27, 2026, 10:06 PM UTC
Resolved: Apr 28, 2026, 12:16 AM UTC
Duration: 2h 9m

Timeline · 3 updates

investigating Apr 27, 2026, 10:06 PM UTC

It seems to be only in AWS clusters for now. We have a workaround we can apply while we investigate.
investigating Apr 27, 2026, 10:06 PM UTC

We are continuing to investigate this issue.
resolved Apr 28, 2026, 01:06 AM UTC

We rolled back a change we made to our authentication system. Any image pushes or config changes to a deployment that occurred after our rollback caused the deployment to fix itself, which is why so many people found their 403s resolved themselves after some time.


Minor April 24, 2026

Azure East US multiservice outage impacting Astro deployments in the region

Detected by Pingoru: Apr 24, 2026, 05:28 PM UTC
Resolved: Apr 25, 2026, 05:05 AM UTC
Duration: 11h 37m

Timeline · 3 updates

identified Apr 24, 2026, 05:28 PM UTC

Azure East US has reported multi-service impact that is affecting Astro deployments in the region. For more information on Azure outage, please visit: https://azure.status.microsoft/en-us/status
monitoring Apr 25, 2026, 04:08 AM UTC

Azure has fixed the issue, and services are back to normal in East US. We are monitoring to make sure everything stays stable.
resolved Apr 25, 2026, 05:09 AM UTC

Azure has confirmed the issue is resolved and services are back to normal. This incident is now closed. For more details, see the Azure incident history: https://azure.status.microsoft/en-us/status/history/


Minor April 17, 2026

Terminating Workers Accepting Tasks Causing Failures in Astro Executor

Detected by Pingoru: Apr 17, 2026, 08:08 AM UTC
Resolved: Apr 17, 2026, 11:47 AM UTC
Duration: 3h 39m

Timeline · 4 updates

investigating Apr 17, 2026, 08:08 AM UTC

We have identified an issue with Astro Executor deployments where terminating workers can continue to accept new tasks, which can, in some cases, cause these tasks to fail.
identified Apr 17, 2026, 08:08 AM UTC

We have identified the issue and we are rolling out the fix for the affected deployments.
monitoring Apr 17, 2026, 10:08 AM UTC

The fix has been implemented, and we are now monitoring the deployments.
resolved Apr 17, 2026, 12:07 PM UTC

This issue has been resolved.


Major April 16, 2026

Runtime 3.2-1 Yanked - Incompatible with Env Manager

Detected by Pingoru: Apr 16, 2026, 06:06 PM UTC
Resolved: Apr 16, 2026, 10:25 PM UTC
Duration: 4h 18m

Timeline · 2 updates

identified Apr 16, 2026, 06:06 PM UTC

We have identified that Runtime 3.2-1 is incompatible with the Astro Environment Manager. Any Connections or Variables stored at the Workspace level will not be available on deployments running 3.2-1. For this reason, we have disallowed the use of 3.2-1 for any deployments which are not already on that version. We are working to release a 3.2-2 version that is properly compatible as quickly as possible.
resolved Apr 16, 2026, 11:06 PM UTC

Astronomer Runtime 3.2-2 has been released with the fix.


Notice April 13, 2026

Degraded service for some deployments in Azure West Europe

Detected by Pingoru: Apr 13, 2026, 10:07 PM UTC
Resolved: Apr 13, 2026, 10:11 PM UTC
Duration: 4m

Timeline · 3 updates

identified Apr 13, 2026, 10:07 PM UTC

We are currently investigating an issue affecting deployments in our Azure West Europe region. Some customer deployments are experiencing degraded performance due to a compute resource constraint in our shared infrastructure. Our engineering team has identified that the region has reached its vCPU quota limit for a specific compute type, which is preventing new resources from being provisioned. We have opened a high-priority support request with Azure to increase this quota and are actively working with our Azure representative to expedite the approval. We will provide updates as soon as we have more information about the timeline for resolution. We apologize for any disruption this may cause to your service.
identified Apr 13, 2026, 11:05 PM UTC

We are continuing to work on a fix for this issue.
resolved Apr 13, 2026, 11:05 PM UTC

**RESOLVED**: Azure West Europe Cluster Capacity Issue. We have resolved a capacity issue affecting our Azure West Europe (westeurope) shared cluster that occurred between 1:34 PM and 3:09 PM PDT on April 13. During this window, the cluster reached its vCPU quota, causing some deployments to become unhealthy. Our team immediately engaged Azure support and provisioned additional capacity as a temporary workaround, allowing all workloads to resume scheduling. All services are now fully operational and healthy. We're working with Azure on a permanent quota increase to prevent recurrence. We apologize for any disruption. If you experienced issues, please contact our support team.


Minor April 12, 2026

Astro Alerts Degraded Performance

Detected by Pingoru: Apr 12, 2026, 10:56 PM UTC
Resolved: Apr 13, 2026, 11:12 AM UTC
Duration: 12h 15m

Timeline · 4 updates

investigating Apr 12, 2026, 10:56 PM UTC

We are currently investigating an issue causing false positive alerts via Astro Alerts. Our team is actively investigating the issue.
identified Apr 13, 2026, 12:18 AM UTC

We have identified the issue behind the degraded performance of Astro Alerts and are currently working on a fix.
monitoring Apr 13, 2026, 12:40 AM UTC

We have applied a fix for the Astro Alerts degraded performance and are currently monitoring it.
resolved Apr 13, 2026, 12:07 PM UTC

The issue has been resolved. Over the weekend, we observed a degradation in alert performance. With the fix now applied, any missed alerts are expected to be delivered now, and new alerts will be triggered without latency. Alerting has returned to normal, and there is no ongoing delay. All alerts should now fire as expected.


Notice April 2, 2026

AWS Outage

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Dec 08, 2021, 12:53 PM UTC
Duration: —

Timeline · 2 updates

identified Apr 02, 2026, 10:33 PM UTC

There is an ongoing AWS outage, specifically in the us-east-1 region, as reported here: https://status.aws.amazon.com/ This outage is limiting our ability to troubleshoot and mitigate issues within customer clusters and Airflow deployments that are hosted in this region. All health indicators show these clusters and deployments remain healthy at this time, but we recommend not making any changes until the outage is over.
resolved Apr 02, 2026, 10:33 PM UTC

This incident has been resolved.


Major April 2, 2026

Astronomer Cloud Outage

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Dec 16, 2021, 03:28 PM UTC
Duration: —

Timeline · 3 updates

resolved Apr 02, 2026, 10:33 PM UTC

This incident has been resolved.
investigating Apr 02, 2026, 10:33 PM UTC

We are currently investigating an issue with the Astronomer Cloud UI and API. User impact: The interface may be intermittently available at this time, and Airflow UIs appear to be inaccessible due to the inability to authenticate users through the API. Given the API's role in Deployment and Cluster management, those components are likely to be unresponsive during this time. Please stand by for updates. If you have any immediate questions, please reach out in Slack or open a ticket at support.astronomer.io.
monitoring Apr 02, 2026, 10:33 PM UTC

At this time, all components appear to be back online and operating as expected. We're going to monitor for a while before fully resolving this incident.


Major April 2, 2026

Image Registry Unavailable

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Dec 09, 2021, 09:42 AM UTC
Duration: —

Timeline · 5 updates

investigating Apr 02, 2026, 10:33 PM UTC

The issue with our image repository is still being investigated, but most customers should be able to resume normal operations. Please subscribe for further updates.
investigating Apr 02, 2026, 10:33 PM UTC

We are continuing to investigate this issue.
investigating Apr 02, 2026, 10:33 PM UTC

Our image repository is currently unavailable. We're investigating the issue and will update as we know more.
resolved Apr 02, 2026, 10:33 PM UTC

A fix has been applied: we added a new query to the Astro API that retrieves deployments and updated the Docker registry auth proxy to use that query.
investigating Apr 02, 2026, 10:33 PM UTC

A fix has been applied for the docker-registry auth proxy.


Major April 2, 2026

API Key / Access Token Issue

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Jan 14, 2022, 07:32 PM UTC
Duration: —

Timeline · 4 updates

resolved Apr 02, 2026, 10:33 PM UTC

This incident has been resolved.
identified Apr 02, 2026, 10:33 PM UTC

The fix has been applied. Things should be working as expected. We'll continue to monitor for the time being, but don't hesitate to reach out if you run into any issues.
investigating Apr 02, 2026, 10:33 PM UTC

We are investigating an issue with API Keys / Access Tokens. Impact: CI/CD pipelines will likely fail. Workaround: Deploy using the astro CLI tool, if possible.
identified Apr 02, 2026, 10:33 PM UTC

We have identified the issue and are working on applying the fix.


Major April 2, 2026

Deployments and Login Instability

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Dec 16, 2021, 11:49 PM UTC
Duration: —

Timeline · 3 updates

investigating Apr 02, 2026, 10:33 PM UTC

We have observed long and unstable logins to the Cloud UI. Additionally, we see that deployments of new images are not working at least some of the time. We believe both of these to be related to the same root cause and are investigating.
monitoring Apr 02, 2026, 10:33 PM UTC

The issue was caused by a slowdown from Auth0, which has now passed. All services appear to be functioning properly again.
resolved Apr 02, 2026, 10:33 PM UTC

This incident has been resolved.


Major April 2, 2026

New images not being deployed

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Mar 11, 2022, 06:00 AM UTC
Duration: —

Timeline · 5 updates

resolved Apr 02, 2026, 10:33 PM UTC

This incident has been resolved.
identified Apr 02, 2026, 10:33 PM UTC

While repairing the system, some Fernet keys (used to encrypt passwords in the Airflow connection configuration and variable configuration) are being restored. During this time, customers may see errors such as "could not create Fernet object: Incorrect padding", as well as variables or connections being marked as Invalid. The keys are currently being restored, and this error should disappear soon.
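
To illustrate where an "Incorrect padding" message can come from: a Fernet key is a URL-safe base64 encoding of 32 random bytes, and decoding a truncated or otherwise malformed key fails at the base64 step. The sketch below uses only the Python standard library. The helper `check_fernet_key` is hypothetical and is not part of Airflow or Astro, and the truncated key is only an example of a malformed key, not the confirmed cause of this incident.

```python
import base64
import binascii
import os

def check_fernet_key(key: str) -> str:
    """Report whether a string looks like a well-formed Fernet key."""
    try:
        raw = base64.urlsafe_b64decode(key)
    except binascii.Error as exc:
        # Malformed base64 surfaces here, e.g. "Incorrect padding".
        return f"invalid: {exc}"
    if len(raw) != 32:
        return f"invalid: decoded to {len(raw)} bytes, expected 32"
    return "ok"

# A well-formed key: 32 random bytes, URL-safe base64 encoded (44 characters).
good_key = base64.urlsafe_b64encode(os.urandom(32)).decode()

# Simulate a damaged key by dropping the trailing "=" padding character.
truncated_key = good_key.rstrip("=")

print(check_fernet_key(good_key))       # ok
print(check_fernet_key(truncated_key))  # invalid: Incorrect padding
```

A check like this only validates the shape of a key. It cannot tell whether the key is the one that originally encrypted a given connection or variable.
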
identified Apr 02, 2026, 10:33 PM UTC

This issue is continuing to affect customers wishing to update their Airflow deployments. The Astronomer team has identified the underlying issue and is actively working to deliver a fix.
monitoring Apr 02, 2026, 10:33 PM UTC

All Airflow deployments are now operating normally. We continue to monitor the issue but do not expect any further service interruptions.
identified Apr 02, 2026, 10:33 PM UTC

An issue has been identified where images are successfully built and pushed, but the deployments are not being updated with the new images. This is intermittently affecting both CI/CD builds and builds triggered by the astro CLI tool. Astronomer will be monitoring and manually updating deployments with out-of-date images, but if you notice an image not being deployed, please contact support.


Minor April 2, 2026

Deployment Status is broken

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Mar 31, 2022, 03:17 AM UTC
Duration: —

Timeline · 5 updates

investigating Apr 02, 2026, 10:33 PM UTC

We are currently investigating an issue where all deployments are shown as Unhealthy even when they are healthy. You can still open the Airflow web UI even though it says "Airflow Unavailable". Existing deployments and updates are NOT affected; only the status information is affected.
identified Apr 02, 2026, 10:33 PM UTC

The issue has been identified and a fix is being implemented.
monitoring Apr 02, 2026, 10:33 PM UTC

Deployment statuses should turn Healthy shortly for deployments that are in fact healthy.
resolved Apr 02, 2026, 10:33 PM UTC

The incident has been resolved. All deployments should now display the correct health status.
identified Apr 02, 2026, 10:33 PM UTC

We are continuing to work on a fix for this issue.


Major April 2, 2026

Issues with Astro-Runtime v5.0.0 deployments

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: May 03, 2022, 12:41 AM UTC
Duration: —

Timeline · 4 updates

identified Apr 02, 2026, 10:33 PM UTC

We are continuing to work on a fix for this issue.
resolved Apr 02, 2026, 10:33 PM UTC

Full functionality has been restored. Astro Runtime 5.0.0, which contains Airflow 2.3.0, is once again available for use on Astro.
identified Apr 02, 2026, 10:33 PM UTC

Any new images that are built using Astro Runtime 5.0.0 will fail to launch workers on Astro. Customers who have pushed Astro Runtime 5.0.0 images will be contacted by our support team.
identified Apr 02, 2026, 10:33 PM UTC

We have identified the root cause of the issue and are working on a hotfix. We have disabled the selection of Astro Runtime 5.0.0 (and thus Airflow 2.3.0) until we can release that hotfix.


Notice April 2, 2026

User Access Unavailable

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Jun 21, 2022, 07:45 AM UTC
Duration: —

Timeline · 1 update

resolved Apr 02, 2026, 10:33 PM UTC

Due to an underlying outage in a major edge network provider, Astro's authentication service was returning intermittent errors and delayed responses. As a result, user access to the Cloud UI and Airflow UIs in the data plane was impeded. Airflow deployments were not directly impacted and tasks were scheduled as expected during this period.


Major April 2, 2026

Astro Authentication Unavailable For Some Users

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Jul 28, 2022, 06:53 PM UTC
Duration: —

Timeline · 2 updates

resolved Apr 02, 2026, 10:33 PM UTC

Service has been fully restored.
investigating Apr 02, 2026, 10:33 PM UTC

We are currently investigating an issue where some customers are not able to access https://cloud.astronomer.io. Pipeline and task executions are not affected.


Critical April 2, 2026

Astro Platform UI Down

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Jul 09, 2022, 04:56 PM UTC
Duration: —

Timeline · 6 updates

investigating Apr 02, 2026, 10:33 PM UTC

The UI for cloud.astronomer.io is currently inaccessible and updates to deployments are not working. Neither the Airflow UIs nor the Astronomer UI can be reached. Existing Airflow deployments will continue to operate, but they are not viewable in the browser.
investigating Apr 02, 2026, 10:33 PM UTC

We are continuing to investigate this issue.
investigating Apr 02, 2026, 10:33 PM UTC

We are currently investigating the issue.
monitoring Apr 02, 2026, 10:33 PM UTC

Deployments are now accurately reporting their status. Astro is back to normal operation.
investigating Apr 02, 2026, 10:33 PM UTC

Airflow and Astronomer UI are now available again and new deployments will function correctly now. Airflow deployments may show as unhealthy and you may see "Airflow Not Available" but this message is inaccurate. You can still click on the link to access your Airflow deployments. We are in the process of correcting this spurious "Airflow Not Available" warning.
resolved Apr 02, 2026, 10:33 PM UTC

After further monitoring, we have concluded that all systems are now operational and stable.


Minor April 2, 2026

Worker size in Cloud UI not consistently respected

Detected by Pingoru: Apr 02, 2026, 10:33 PM UTC
Resolved: Aug 24, 2022, 05:43 PM UTC
Duration: —

Timeline · 3 updates

resolved Apr 02, 2026, 10:33 PM UTC

We’ve deployed an update that resolves the issue of worker size not being consistently respected when deploying code with the Astro CLI using API credentials.
identified Apr 02, 2026, 10:33 PM UTC

We have identified the cause and determined that this regression only occurs when deployments are pushed while logged in using an API key instead of a user account. This login method is common in CI/CD based deployments. We are testing a fix currently.
investigating Apr 02, 2026, 10:33 PM UTC

We are currently investigating the issue. For most customers, this issue should not be noticeable. For some deployment patterns, this issue may cause the Kubernetes autoscaler to scale down without respect for any currently running tasks which would kill those tasks and render them as zombies. As a temporary workaround, each time you trigger a deployment using `astro deploy`, please also update your Worker Queue(s) from the Astro UI by selecting your Workspace, then selecting your Deployment, then clicking "Edit" on your Default Worker Queue, then clicking the "Update Queue" button.


Major April 1, 2026

Deployment changes fail due to an unrelated "hibernation" error

Detected by Pingoru: Apr 01, 2026, 06:31 PM UTC
Resolved: Apr 01, 2026, 07:49 PM UTC
Duration: 1h 18m

Affected: Deployment Management

Timeline · 3 updates

identified Apr 01, 2026, 06:31 PM UTC

The issue has been identified and a fix is being implemented.
identified Apr 01, 2026, 06:32 PM UTC

We are continuing to work on a fix for this issue.
resolved Apr 01, 2026, 07:49 PM UTC

This incident has been resolved.
