Astronomer Outage History

Astronomer is up right now

Astronomer had 60 outages in the last 2 years totaling 67h 38m of downtime — averaging 2.5 incidents per month.

The 60 Astronomer outages recorded since April 18, 2025 are summarised below, with incident details, duration, and resolution information for each.
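As a rough check on the summary figures, the sketch below reproduces the quoted monthly average and derives a mean outage length. It assumes "the last 2 years" means a 24-month window, which is what the 2.5 figure implies; the variable names are illustrative only.

```python
from datetime import timedelta

# Figures quoted in the summary at the top of this page.
outages = 60
total_downtime = timedelta(hours=67, minutes=38)

# Assumption: "the last 2 years" is treated as 24 months.
months = 24

incidents_per_month = outages / months
mean_outage = total_downtime / outages

print(f"{incidents_per_month:.1f} incidents per month")
print(f"mean outage length: {mean_outage}")
```

Several entries below list no duration, so the mean is only indicative.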

Source: https://status.astronomer.io

Minor June 16, 2026

Azure clusters may experience issues with worker scale up and task pod scheduling

Detected by Pingoru
Jun 16, 2026, 12:05 AM UTC
Resolved
Jun 16, 2026, 12:53 AM UTC
Duration
47m
Timeline · 4 updates
  1. identified Jun 16, 2026, 12:05 AM UTC

    **The issue, according to the Azure status page:** Investigating a spike in 401 authentication errors. We are actively investigating customer alerts of 401 authentication errors when pulling images from mcr.microsoft.com. We've identified a potential underlying factor related to a recent update made to a backend service and we are actively working on determining mitigation workstreams. More information will be provided shortly. This message was last updated at 23:04 UTC on 15 June 2026.

  2. monitoring Jun 16, 2026, 12:05 AM UTC

    A fix has been implemented and we are monitoring the results.

  3. investigating Jun 16, 2026, 12:05 AM UTC

    We are currently investigating this issue.

  4. resolved Jun 16, 2026, 01:07 AM UTC

    This incident has been resolved.

Read the full incident report →
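The Duration values on this page can be approximated from the Detected and Resolved timestamps. The sketch below assumes the timestamp format shown in the entries; `parse_stamp` and `format_duration` are hypothetical helper names, not part of any Astronomer or Pingoru tooling.

```python
from datetime import datetime, timezone

# Format used by the timestamps on this page, e.g. "Jun 16, 2026, 12:05 AM UTC".
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"

def parse_stamp(text: str) -> datetime:
    """Parse a page timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(text, STAMP_FORMAT).replace(tzinfo=timezone.utc)

def format_duration(start: datetime, end: datetime) -> str:
    """Render the gap between two datetimes as "Xh Ym" or "Ym"."""
    minutes = int((end - start).total_seconds() // 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes}m" if hours else f"{minutes}m"

detected = parse_stamp("Jun 16, 2026, 12:05 AM UTC")
resolved = parse_stamp("Jun 16, 2026, 12:53 AM UTC")
print(format_duration(detected, resolved))
```

For the entry above this yields 48m rather than the listed 47m. A plausible explanation, though it is an assumption, is that the page computes durations from timestamps with seconds that are not shown here.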

Notice June 15, 2026

astro deploys not reflecting on certain Astro clusters

Detected by Pingoru
Jun 15, 2026, 04:23 PM UTC
Resolved
Jun 15, 2026, 05:51 PM UTC
Duration
1h 27m
Timeline · 3 updates
  1. identified Jun 15, 2026, 04:23 PM UTC

    We've identified an issue where `astro deploy` does not update manifests in Astro Hosted clusters. We're applying a workaround and continuing to monitor the situation.

  2. monitoring Jun 15, 2026, 04:54 PM UTC

    We are monitoring the effect of the applied fix and are observing error rates coming down or ceasing for affected clusters.

  3. resolved Jun 15, 2026, 05:51 PM UTC

    Mitigation was effective and all affected clusters are healthy.

Read the full incident report →

Minor May 8, 2026

Increased Task Failures in US-East-1 (AWS) - AWS Incident

Detected by Pingoru
May 08, 2026, 08:06 AM UTC
Resolved
May 09, 2026, 01:43 AM UTC
Duration
17h 37m
Timeline · 3 updates
  1. monitoring May 08, 2026, 08:06 AM UTC

    We are continuing to monitor the situation as AWS works toward full recovery in the affected Availability Zone. At this time, we are seeing signs of stabilization, and most transient failures should continue to self-resolve. We will provide further updates once AWS confirms that the issue has been fully resolved.

  2. investigating May 08, 2026, 08:06 AM UTC

    We are currently observing elevated task failures and latency for some deployments running in the AWS us-east-1 region. This is related to an ongoing AWS incident affecting a single Availability Zone (use1-az4), where EC2 and EBS resources have experienced impairments. What to expect: You may see intermittent task failures or retries in your deployments. In most cases, these failures are transient and should self-resolve automatically as AWS continues recovery. What you should do: No immediate action is required. However, if you notice consistent or prolonged failures, please reach out to Astronomer Support, and we’ll help investigate further. We are continuing to monitor the situation closely and will share updates as needed. Refer: https://health.aws.amazon.com/health/status

  3. resolved May 09, 2026, 02:07 AM UTC

    AWS has confirmed recovery in the affected Availability Zone (use1-az4), and system performance has returned to normal levels. We are no longer observing elevated task failures or latency in the AWS us-east-1 region. Any previously impacted workloads should now be operating as expected. If you continue to experience issues, please reach out to Astronomer Support for assistance.

Read the full incident report →

Minor May 7, 2026

Delayed delivery for some time-based alerts

Detected by Pingoru
May 07, 2026, 11:08 AM UTC
Resolved
May 07, 2026, 02:24 PM UTC
Duration
3h 16m
Timeline · 5 updates
  1. investigating May 07, 2026, 11:08 AM UTC

    We are investigating delayed delivery for some time-based alerts. The delay is primarily visible for DAG Timeliness alerts, and may also affect Observe-related SLA, Proactive SLA, and Data Quality monitor alerts. DAG Duration and Task Duration alerts do not appear to be affected at this time.

  2. identified May 07, 2026, 12:08 PM UTC

    We have identified the cause of the delayed delivery for some time-based alerts and are implementing a fix. We will provide another update once the fix has been deployed and validated.

  3. identified May 07, 2026, 01:06 PM UTC

    A fix has been prepared and is moving through deployment. We will provide another update once the rollout is complete and validation is underway.

  4. monitoring May 07, 2026, 02:07 PM UTC

    A fix has been implemented, and we are monitoring the results.

  5. resolved May 07, 2026, 03:08 PM UTC

    The incident has been resolved.

Read the full incident report →

Major May 6, 2026

Deployments on Astronomer Runtime 3.2-3 are returning 403 errors when accessing dags page.

Detected by Pingoru
May 06, 2026, 08:05 PM UTC
Resolved
May 06, 2026, 08:05 PM UTC
Duration
Timeline · 2 updates
  1. investigating May 06, 2026, 08:05 PM UTC

    Attempting to access the dags page in the Airflow UI results in a 403 Forbidden error. This should not be affecting task execution.

  2. resolved May 06, 2026, 09:05 PM UTC

    Our team has determined that this only affects internal access; Astro end users are unaffected.

Read the full incident report →

Major May 4, 2026

Dashboard cost breakdown data delays

Detected by Pingoru
May 04, 2026, 11:09 AM UTC
Resolved
May 04, 2026, 08:15 PM UTC
Duration
9h 6m
Timeline · 3 updates
  1. investigating May 04, 2026, 11:09 AM UTC

    We’re investigating an issue affecting Dashboard cost breakdown data. For affected customers, cost breakdown information may appear stale and may not have updated since May 1, 2026. Our team is actively investigating the cause and working to restore current data. We’ll share another update as we have more information.

  2. investigating May 04, 2026, 11:09 AM UTC

    We are continuing to investigate this issue.

  3. resolved May 04, 2026, 09:08 PM UTC

    This incident is now resolved, and all Cost Breakdown data is now up to date. The issue that caused the delay has been fixed and should not recur.

Read the full incident report →

Major April 27, 2026

Airflow UI showing 403s for some customers

Detected by Pingoru
Apr 27, 2026, 10:06 PM UTC
Resolved
Apr 28, 2026, 12:16 AM UTC
Duration
2h 9m
Timeline · 3 updates
  1. investigating Apr 27, 2026, 10:06 PM UTC

    The 403 errors appear to be limited to AWS clusters for now. We have a workaround we can apply while we investigate.

  2. investigating Apr 27, 2026, 10:06 PM UTC

    We are continuing to investigate this issue.

  3. resolved Apr 28, 2026, 01:06 AM UTC

    We rolled back a change we made to our authentication system. Any image pushes or config changes to a deployment that occurred after our rollback caused the deployment to fix itself, which is why so many people found their 403s resolved themselves after some time.

Read the full incident report →

Minor April 24, 2026

Azure East US multiservice outage impacting Astro deployments in the region

Detected by Pingoru
Apr 24, 2026, 05:28 PM UTC
Resolved
Apr 25, 2026, 05:05 AM UTC
Duration
11h 37m
Timeline · 3 updates
  1. identified Apr 24, 2026, 05:28 PM UTC

    Azure East US has reported multi-service impact that is affecting Astro deployments in the region. For more information on the Azure outage, please visit: https://azure.status.microsoft/en-us/status

  2. monitoring Apr 25, 2026, 04:08 AM UTC

    Azure has fixed the issue, and services are back to normal in East US. We are monitoring to make sure everything stays stable.

  3. resolved Apr 25, 2026, 05:09 AM UTC

    Azure has confirmed the issue is resolved and services are back to normal. This incident is now closed. For more details, see the Azure incident history: https://azure.status.microsoft/en-us/status/history/

Read the full incident report →

Minor April 17, 2026

Terminating Workers Accepting Tasks Causing Failures in Astro Executor

Detected by Pingoru
Apr 17, 2026, 08:08 AM UTC
Resolved
Apr 17, 2026, 11:47 AM UTC
Duration
3h 39m
Timeline · 4 updates
  1. investigating Apr 17, 2026, 08:08 AM UTC

    We have identified an issue with Astro Executor deployments where terminating workers can continue to accept new tasks, which can, in some cases, cause these tasks to fail.

  2. identified Apr 17, 2026, 08:08 AM UTC

    We have identified the issue and we are rolling out the fix for the affected deployments.

  3. monitoring Apr 17, 2026, 10:08 AM UTC

    The fix has been implemented, and we are now monitoring the deployments.

  4. resolved Apr 17, 2026, 12:07 PM UTC

    This issue has been resolved.

Read the full incident report →

Major April 16, 2026

Runtime 3.2-1 Yanked - Incompatible with Env Manager

Detected by Pingoru
Apr 16, 2026, 06:06 PM UTC
Resolved
Apr 16, 2026, 10:25 PM UTC
Duration
4h 18m
Timeline · 2 updates
  1. identified Apr 16, 2026, 06:06 PM UTC

    We have identified that Runtime 3.2-1 is incompatible with the Astro Environment Manager. Any Connections or Variables stored at the Workspace level will not be available on deployments running 3.2-1. For this reason, we have disallowed the use of 3.2-1 for any deployments which are not already on that version. We are working to release a 3.2-2 version that is properly compatible as quickly as possible.

  2. resolved Apr 16, 2026, 11:06 PM UTC

    Astronomer Runtime 3.2-2 has been released with the fix.

Read the full incident report →

Notice April 13, 2026

Degraded service for some deployments in Azure West Europe

Detected by Pingoru
Apr 13, 2026, 10:07 PM UTC
Resolved
Apr 13, 2026, 10:11 PM UTC
Duration
4m
Timeline · 3 updates
  1. identified Apr 13, 2026, 10:07 PM UTC

    We are currently investigating an issue affecting deployments in our Azure West Europe region. Some customer deployments are experiencing degraded performance due to a compute resource constraint in our shared infrastructure. Our engineering team has identified that the region has reached its vCPU quota limit for a specific compute type, which is preventing new resources from being provisioned. We have opened a high-priority support request with Azure to increase this quota and are actively working with our Azure representative to expedite the approval. We will provide updates as soon as we have more information about the timeline for resolution. We apologize for any disruption this may cause to your service.

  2. identified Apr 13, 2026, 11:05 PM UTC

    We are continuing to work on a fix for this issue.

  3. resolved Apr 13, 2026, 11:05 PM UTC

    **RESOLVED**: Azure West Europe Cluster Capacity Issue. We have resolved a capacity issue affecting our Azure West Europe (westeurope) shared cluster that occurred between 1:34 PM and 3:09 PM PDT on April 13. During this window, the cluster reached its vCPU quota, causing some deployments to become unhealthy. Our team immediately engaged Azure support and provisioned additional capacity as a temporary workaround, allowing all workloads to resume scheduling. All services are now fully operational and healthy. We're working with Azure on a permanent quota increase to prevent recurrence. We apologize for any disruption. If you experienced issues, please contact our support team.

Read the full incident report →
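The resolution note above gives the impact window in PDT, while the rest of this page uses UTC. The sketch below converts that window, assuming PDT is a fixed UTC-7 offset and that the year is 2026 to match the entry's date.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PDT is modelled as a fixed UTC-7 offset, and the year is 2026
# to match the date of the entry.
PDT = timezone(timedelta(hours=-7), "PDT")

start = datetime(2026, 4, 13, 13, 34, tzinfo=PDT)  # 1:34 PM PDT
end = datetime(2026, 4, 13, 15, 9, tzinfo=PDT)     # 3:09 PM PDT

start_utc = start.astimezone(timezone.utc)
end_utc = end.astimezone(timezone.utc)

print(start_utc.strftime("%b %d, %Y, %I:%M %p UTC"))
print(end_utc.strftime("%b %d, %Y, %I:%M %p UTC"))
print(end - start)
```

Under these assumptions the window runs from 08:34 PM to 10:09 PM UTC, about 1h 35m, so the detection time listed for this entry falls near the end of it.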

Minor April 12, 2026

Astro Alerts Degraded Performance

Detected by Pingoru
Apr 12, 2026, 10:56 PM UTC
Resolved
Apr 13, 2026, 11:12 AM UTC
Duration
12h 15m
Timeline · 4 updates
  1. investigating Apr 12, 2026, 10:56 PM UTC

    We are currently investigating an issue causing false positive alerts via Astro Alerts. Our team is actively investigating the issue.

  2. identified Apr 13, 2026, 12:18 AM UTC

    We have identified the cause of the degraded performance in Astro Alerts and are currently working on a fix.

  3. monitoring Apr 13, 2026, 12:40 AM UTC

    We have applied a fix for the Astro Alerts degraded performance and are currently monitoring it.

  4. resolved Apr 13, 2026, 12:07 PM UTC

    The issue has been resolved. Over the weekend, we observed a degradation in alert performance. With the fix now applied, any missed alerts are expected to be delivered now, and new alerts will be triggered without latency. Alerting has returned to normal, and there is no ongoing delay. All alerts should now fire as expected.

Read the full incident report →

Notice April 2, 2026

AWS Outage

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Dec 08, 2021, 12:53 PM UTC
Duration
Timeline · 2 updates
  1. identified Apr 02, 2026, 10:33 PM UTC

    There is an ongoing AWS outage, specifically in the us-east-1 region (see https://status.aws.amazon.com/). This outage is limiting our ability to troubleshoot and mitigate issues within customer clusters and Airflow deployments that are hosted in this region. All health indicators show these clusters and deployments remain healthy at this time, but we recommend not making any changes until the outage is over.

  2. resolved Apr 02, 2026, 10:33 PM UTC

    This incident has been resolved.

Read the full incident report →

Major April 2, 2026

Astronomer Cloud Outage

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Dec 16, 2021, 03:28 PM UTC
Duration
Timeline · 3 updates
  1. resolved Apr 02, 2026, 10:33 PM UTC

    This incident has been resolved.

  2. investigating Apr 02, 2026, 10:33 PM UTC

    We are currently investigating an issue with the Astronomer Cloud UI and API. User impact: The interface may be intermittently available at this time, and Airflow UIs appear to be inaccessible due to the inability to authenticate users through the API. Given the API's role in Deployment and Cluster management, those components are likely to be unresponsive during this time. Please stand by for updates. If you have any immediate questions, please reach out in Slack or open a ticket at support.astronomer.io.

  3. monitoring Apr 02, 2026, 10:33 PM UTC

    At this time, all components appear to be back online and operating as expected. We're going to monitor for a while before fully resolving this incident.

Read the full incident report →

Major April 2, 2026

Image Registry Unavailable

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Dec 09, 2021, 09:42 AM UTC
Duration
Timeline · 5 updates
  1. investigating Apr 02, 2026, 10:33 PM UTC

    The issue with our image repository is still being investigated, but most customers should be able to resume normal operations. Please subscribe for further updates.

  2. investigating Apr 02, 2026, 10:33 PM UTC

    We are continuing to investigate this issue.

  3. investigating Apr 02, 2026, 10:33 PM UTC

    Our image repository is currently unavailable. We're investigating the issue and will update as we know more.

  4. resolved Apr 02, 2026, 10:33 PM UTC

    A fix has been applied: we added a new query to the Astro API that retrieves deployments, and updated the Docker registry auth proxy to use that query.

  5. investigating Apr 02, 2026, 10:33 PM UTC

    Fix has been applied for docker-registry auth proxy.

Read the full incident report →

Major April 2, 2026

API Key / Access Token Issue

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Jan 14, 2022, 07:32 PM UTC
Duration
Timeline · 4 updates
  1. resolved Apr 02, 2026, 10:33 PM UTC

    This incident has been resolved.

  2. identified Apr 02, 2026, 10:33 PM UTC

    The fix has been applied. Things should be working as expected. We'll continue to monitor for the time being, but don't hesitate to reach out if you run into any issues.

  3. investigating Apr 02, 2026, 10:33 PM UTC

    We are investigating an issue with API Keys / Access Tokens. Impact: CI/CD pipelines will likely fail. Workaround: Deploy using the astro CLI tool, if possible.

  4. identified Apr 02, 2026, 10:33 PM UTC

    We have identified the issue and are working on applying the fix.

Read the full incident report →

Major April 2, 2026

Deployments and Login Instability

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Dec 16, 2021, 11:49 PM UTC
Duration
Timeline · 3 updates
  1. investigating Apr 02, 2026, 10:33 PM UTC

    We have observed slow and unstable logins to the Cloud UI. Additionally, we see that deployments of new images are failing at least some of the time. We believe both of these to be related to the same root cause and are investigating.

  2. monitoring Apr 02, 2026, 10:33 PM UTC

    The issue was caused by a slowdown from Auth0, which has now passed. All services appear to be functioning properly again.

  3. resolved Apr 02, 2026, 10:33 PM UTC

    This incident has been resolved.

Read the full incident report →

Major April 2, 2026

New images not being deployed

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Mar 11, 2022, 06:00 AM UTC
Duration
Timeline · 5 updates
  1. resolved Apr 02, 2026, 10:33 PM UTC

    This incident has been resolved.

  2. identified Apr 02, 2026, 10:33 PM UTC

    While repairing the system, some Fernet keys (used to encrypt passwords in the Airflow connection configuration and variable configuration) are being restored. During this time, customers may see errors such as "could not create Fernet object: Incorrect padding", as well as variables or connections being marked as Invalid. The keys are currently being restored, and this error should disappear soon.

  3. identified Apr 02, 2026, 10:33 PM UTC

    This issue is continuing to affect customers wishing to update their Airflow deployments. The Astronomer team has identified the underlying issue and is actively working to deliver a fix.

  4. monitoring Apr 02, 2026, 10:33 PM UTC

    All Airflow deployments are now operating normally. We continue to monitor the issue but do not expect any further service interruptions.

  5. identified Apr 02, 2026, 10:33 PM UTC

    An issue has been identified where images are successfully built and pushed, but the deployments are not being updated with the new images. This is intermittently affecting both CI/CD builds as well as builds triggered by the astro CLI tool. Astronomer will be monitoring and manually updating deployments with out of date images, but if you notice an image not being deployed, please contact support.

Read the full incident report →
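The "Incorrect padding" text quoted in the timeline above is the message Python's base64 decoder raises for a malformed string. A Fernet key is 32 random bytes in URL-safe base64, so a truncated or otherwise damaged key can fail at exactly that decoding step. The sketch below illustrates the general mechanism with the standard library only; `check_key` is a hypothetical helper, and whether truncation is what happened in this incident is not stated in the source.

```python
import base64
import binascii
import os

# A Fernet key is 32 random bytes, URL-safe base64 encoded (44 characters).
valid_key = base64.urlsafe_b64encode(os.urandom(32))

def check_key(key: bytes) -> str:
    """Hypothetical helper: report whether a key decodes to 32 bytes."""
    try:
        raw = base64.urlsafe_b64decode(key)
    except binascii.Error as exc:
        return f"invalid: {exc}"
    return "ok" if len(raw) == 32 else f"invalid: decoded to {len(raw)} bytes"

print(check_key(valid_key))
print(check_key(valid_key[:-1]))  # trailing "=" lost, e.g. by truncation
```

The second call fails during base64 decoding, before any encryption logic runs, which is why the error mentions padding rather than passwords or connections.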

Minor April 2, 2026

Deployment Status is broken

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Mar 31, 2022, 03:17 AM UTC
Duration
Timeline · 5 updates
  1. investigating Apr 02, 2026, 10:33 PM UTC

    We are currently investigating an issue where all deployments are shown as Unhealthy even when they are healthy. You can still open the Airflow web UI even though it says "Airflow Unavailable". Existing deployments and updates are NOT affected; only the status information is affected.

  2. identified Apr 02, 2026, 10:33 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Apr 02, 2026, 10:33 PM UTC

    Deployment statuses should return to Healthy shortly for all deployments that are healthy.

  4. resolved Apr 02, 2026, 10:33 PM UTC

    The incident has been resolved. All deployments should now display the correct health status.

  5. identified Apr 02, 2026, 10:33 PM UTC

    We are continuing to work on a fix for this issue.

Read the full incident report →

Major April 2, 2026

Issues with Astro-Runtime v5.0.0 deployments

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
May 03, 2022, 12:41 AM UTC
Duration
Timeline · 4 updates
  1. identified Apr 02, 2026, 10:33 PM UTC

    We are continuing to work on a fix for this issue.

  2. resolved Apr 02, 2026, 10:33 PM UTC

    Full functionality has been restored. Astro Runtime 5.0.0, which contains Airflow 2.3.0, is once again available for use on Astro.

  3. identified Apr 02, 2026, 10:33 PM UTC

    Any new images that are built using Astro Runtime 5.0.0 will fail to launch workers on Astro. Customers who have pushed Astro Runtime 5.0.0 images will be contacted by our support team.

  4. identified Apr 02, 2026, 10:33 PM UTC

    We have identified the root cause of the issue and are working on a hotfix. We have disabled the selection of Astro Runtime 5.0.0 (and thus Airflow 2.3.0) until we can release that hotfix.

Read the full incident report →

Notice April 2, 2026

User Access Unavailable

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Jun 21, 2022, 07:45 AM UTC
Duration
Timeline · 1 update
  1. resolved Apr 02, 2026, 10:33 PM UTC

    Due to an underlying outage in a major edge network provider, Astro's authentication service was returning intermittent errors and delayed responses. As a result, user access to the Cloud UI and Airflow UIs in the data plane was impeded. Airflow deployments were not directly impacted and tasks were scheduled as expected during this period.

Read the full incident report →

Major April 2, 2026

Astro Authentication Unavailable For Some Users

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Jul 28, 2022, 06:53 PM UTC
Duration
Timeline · 2 updates
  1. resolved Apr 02, 2026, 10:33 PM UTC

    Service has been fully restored.

  2. investigating Apr 02, 2026, 10:33 PM UTC

    We are currently investigating an issue where some customers are not able to access https://cloud.astronomer.io. Pipeline and task executions are not affected.

Read the full incident report →

Critical April 2, 2026

Astro Platform UI Down

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Jul 09, 2022, 04:56 PM UTC
Duration
Timeline · 6 updates
  1. investigating Apr 02, 2026, 10:33 PM UTC

    The UI for cloud.astronomer.io is currently inaccessible and updates to deployments are not working. Neither the Airflow UIs nor the Astronomer UI can be reached. Existing Airflow deployments will continue to operate, but they are not viewable in the browser.

  2. investigating Apr 02, 2026, 10:33 PM UTC

    We are continuing to investigate this issue.

  3. investigating Apr 02, 2026, 10:33 PM UTC

    We are currently investigating the issue.

  4. monitoring Apr 02, 2026, 10:33 PM UTC

    Deployments are now accurately reporting their status. Astro is back to normal operation.

  5. investigating Apr 02, 2026, 10:33 PM UTC

    Airflow and Astronomer UI are now available again and new deployments will function correctly now. Airflow deployments may show as unhealthy and you may see "Airflow Not Available" but this message is inaccurate. You can still click on the link to access your Airflow deployments. We are in the process of correcting this spurious "Airflow Not Available" warning.

  6. resolved Apr 02, 2026, 10:33 PM UTC

    After further monitoring, we have concluded that all systems are now operational and stable.

Read the full incident report →

Minor April 2, 2026

Worker size in Cloud UI not consistently respected

Detected by Pingoru
Apr 02, 2026, 10:33 PM UTC
Resolved
Aug 24, 2022, 05:43 PM UTC
Duration
Timeline · 3 updates
  1. resolved Apr 02, 2026, 10:33 PM UTC

    We’ve deployed an update that resolves the issue of worker size not being consistently respected when deploying code with the Astro CLI using API credentials.

  2. identified Apr 02, 2026, 10:33 PM UTC

    We have identified the cause and determined that this regression only occurs when deployments are pushed while logged in using an API key instead of a user account. This login method is common in CI/CD based deployments. We are testing a fix currently.

  3. investigating Apr 02, 2026, 10:33 PM UTC

    We are currently investigating the issue. For most customers, this issue should not be noticeable. For some deployment patterns, this issue may cause the Kubernetes autoscaler to scale down without respect for any currently running tasks which would kill those tasks and render them as zombies. As a temporary workaround, each time you trigger a deployment using `astro deploy`, please also update your Worker Queue(s) from the Astro UI by selecting your Workspace, then selecting your Deployment, then clicking "Edit" on your Default Worker Queue, then clicking the "Update Queue" button.

Read the full incident report →

Major April 1, 2026

Deployment changes fail due to an unrelated "hibernation" error

Detected by Pingoru
Apr 01, 2026, 06:31 PM UTC
Resolved
Apr 01, 2026, 07:49 PM UTC
Duration
1h 18m
Affected: Deployment Management
Timeline · 3 updates
  1. identified Apr 01, 2026, 06:31 PM UTC

    The issue has been identified and a fix is being implemented.

  2. identified Apr 01, 2026, 06:32 PM UTC

    We are continuing to work on a fix for this issue.

  3. resolved Apr 01, 2026, 07:49 PM UTC

    This incident has been resolved.

Read the full incident report →