Astronomer Outage History

Astronomer had 48 outages in the last 2 years, totaling 265h 6m of downtime and averaging 2 incidents per month.

The 48 Astronomer outages recorded since March 26, 2025 are summarised below, with incident details, duration, and resolution information for each.

Source: https://status.astronomer.io
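The per-incident Duration values and the headline totals on this page are timestamp arithmetic. The following is a minimal Python sketch of that arithmetic, assuming the timestamp layout used in the entries on this page (for example "Oct 30, 2025, 12:58 AM UTC"). The function names are illustrative and are not part of any Pingoru or Astronomer tooling. Because the displayed timestamps have minute precision, a recomputed duration can differ from the listed one by a minute.

```python
from datetime import datetime, timezone

# Assumed layout of the timestamps shown on this page.
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_stamp(text: str) -> datetime:
    """Parse a stamp such as 'Oct 30, 2025, 12:58 AM UTC' into an aware datetime."""
    return datetime.strptime(text, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def format_duration(start: str, end: str) -> str:
    """Return the elapsed time between two stamps as 'Xd Yh' or 'Xh Ym'."""
    elapsed = parse_stamp(end) - parse_stamp(start)
    minutes = int(elapsed.total_seconds() // 60)
    days, remainder = divmod(minutes, 24 * 60)
    hours, mins = divmod(remainder, 60)
    if days:
        return f"{days}d {hours}h"
    return f"{hours}h {mins}m"


def mean_outage_minutes(total_hours: int, total_minutes: int, outages: int) -> float:
    """Average downtime per outage, in minutes."""
    return (total_hours * 60 + total_minutes) / outages
```

For example, `format_duration("Oct 29, 2025, 11:57 PM UTC", "Oct 30, 2025, 12:58 AM UTC")` returns "1h 1m", and `mean_outage_minutes(265, 6, 48)` returns 331.375, or roughly 5h 31m per outage.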

Major October 30, 2025

Deployment Health Incidents are not available

Detected by Pingoru: Oct 30, 2025, 12:58 AM UTC
Resolved: Oct 30, 2025, 04:27 AM UTC
Duration: 3h 28m
Affected: Deployment Management
Timeline · 2 updates
  1. investigating Oct 30, 2025, 12:58 AM UTC

    https://www.astronomer.io/docs/astro/deployment-health-incidents

  2. resolved Oct 30, 2025, 04:27 AM UTC

    Deployment Health Incidents have been temporarily disabled.

Minor October 29, 2025

Astro UI performance may be severely degraded or unavailable

Detected by Pingoru: Oct 29, 2025, 11:57 PM UTC
Resolved: Oct 30, 2025, 12:58 AM UTC
Duration: 1h 1m
Affected: Deployment Access, Deployment Management, Cloud UI, Astro Observe, Cloud API, Cloud Image Repository, Cluster Management, Dashboards and Analytics
Timeline · 4 updates
  1. investigating Oct 29, 2025, 11:57 PM UTC

    We are currently investigating this issue.

  2. investigating Oct 30, 2025, 12:51 AM UTC

    Deployment Health Incidents may not be working. See https://www.astronomer.io/docs/astro/deployment-health-incidents

  3. monitoring Oct 30, 2025, 12:51 AM UTC

    Deployment Health Incidents may not be working. See https://www.astronomer.io/docs/astro/deployment-health-incidents

  4. resolved Oct 30, 2025, 12:58 AM UTC

    This incident has been resolved.

Critical October 29, 2025

Azure Front Door CDN issue causing timeouts to Astro Cloud UI

Detected by Pingoru: Oct 29, 2025, 04:05 PM UTC
Resolved: Oct 30, 2025, 12:49 AM UTC
Duration: 8h 44m
Affected: Cloud UI
Timeline · 4 updates
  1. investigating Oct 29, 2025, 04:05 PM UTC

    We are currently investigating an apparent issue with Azure's Front Door CDN that is causing DNS timeouts and unavailability of the Astro Cloud UI. Airflow deployments are unaffected and continue to process tasks. We have updated our DNS routing to bypass Front Door for now and are seeing access to the Cloud UI being restored.

  2. monitoring Oct 29, 2025, 04:05 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. monitoring Oct 29, 2025, 04:35 PM UTC

    We are continuing to monitor for any further effects from this Azure outage. You can follow Azure's status here: https://azure.status.microsoft/en-us/status

  4. resolved Oct 30, 2025, 12:49 AM UTC

    Resolved per Azure status: https://azure.status.microsoft/en-us/status

Minor October 28, 2025

Issues with EC2 scale-up time in the AWS us-east-1 use1-az2 Availability Zone causing node scale-up issues for some deployments in AWS us-east-1

Detected by Pingoru: Oct 28, 2025, 06:05 PM UTC
Resolved: Oct 28, 2025, 11:51 PM UTC
Duration: 5h 46m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 3 updates
  1. investigating Oct 28, 2025, 06:05 PM UTC

    We are currently investigating this issue and have been notified by AWS of issues with VM scale ups in AWS us-east-1, specifically the use1-az2 AZ. We are working to identify affected deployments, as well as on mitigations.

  2. monitoring Oct 28, 2025, 07:39 PM UTC

    Customer deployments should no longer be affected; we are monitoring to confirm resolution.

  3. resolved Oct 28, 2025, 11:51 PM UTC

    This incident has been resolved.

Major October 20, 2025

AWS Outage Impacting Astro Deployments

Detected by Pingoru: Oct 20, 2025, 10:00 AM UTC
Resolved: Oct 20, 2025, 09:49 PM UTC
Duration: 11h 48m
Affected: Scheduling and Running DAGs and Tasks, Cloud UI, Cluster Management
Timeline · 8 updates
  1. investigating Oct 20, 2025, 10:00 AM UTC

    We are aware of an ongoing AWS outage in the US-EAST-1 (N. Virginia) region that is impacting multiple AWS services and related infrastructure components. Customers with Astro clusters and deployments hosted on AWS may experience degraded performance, failed task executions, or delays in accessing their environments. Our team is actively monitoring the situation and assessing the impact across affected deployments. For real-time updates from AWS, please refer to their Service Health Dashboard: https://health.aws.amazon.com/health/status. The next update will be provided as more information becomes available.

  2. investigating Oct 20, 2025, 10:00 AM UTC

    We are continuing to investigate this issue.

  3. investigating Oct 20, 2025, 01:08 PM UTC

    We’re aware that the Airflow UI has been running very slowly following the recent AWS outage. Our team is actively investigating the issue.

  4. investigating Oct 20, 2025, 02:27 PM UTC

    The AWS outage is affecting an internal tool, which is causing Airflow UI slowness in clusters running on all clouds (not just AWS). Our development team is working on a fix.

  5. investigating Oct 20, 2025, 04:12 PM UTC

    We are continuing to investigate this issue.

  6. monitoring Oct 20, 2025, 04:30 PM UTC

    We have made a hotfix update to Astro to relieve the Airflow UI slowness in Azure, GCP, and in AWS regions other than us-east-1. We are continuing to monitor the impact of the change, but early signs indicate that the speed of the UI should be improving. This update has no effect on the issues unique to deployments in AWS us-east-1.

  7. monitoring Oct 20, 2025, 06:19 PM UTC

    We have observed pods that were previously stuck in the pending state slowly getting scheduled on EC2 nodes following mitigations applied by the AWS team. This should start resolving issues with task execution. We are actively monitoring the situation.

  8. resolved Oct 20, 2025, 09:49 PM UTC

    All Astronomer components have returned to a healthy state.

Minor October 6, 2025

Internal API intermittent outage causing slow UI loading, intermittent Airflow API issues

Detected by Pingoru: Oct 06, 2025, 04:08 PM UTC
Resolved: Oct 06, 2025, 10:08 PM UTC
Duration: 5h 59m
Affected: Deployment Access, Deployment Management, Cloud UI, Cloud API, Cloud Image Repository, Cluster Management, Dashboards and Analytics
Timeline · 3 updates
  1. identified Oct 06, 2025, 04:08 PM UTC

    We have identified the source of this issue and are putting mitigations in place. Airflow deployment task execution is unaffected. DAG triggering via the Airflow API may experience intermittent issues.

  2. monitoring Oct 06, 2025, 05:52 PM UTC

    The root cause of this incident has been fully identified, and our engineering teams are continuing to work on implementing mitigations. We are also continuing to monitor this issue and the associated underlying systems. Incidence of this issue has dropped significantly; however, degraded performance is still possible.

  3. resolved Oct 06, 2025, 10:08 PM UTC

    This incident has been resolved.

Major September 4, 2025

Some clusters experiencing unintentional ephemeral storage reduction for KPOs

Detected by Pingoru: Sep 04, 2025, 09:48 PM UTC
Resolved: Sep 04, 2025, 11:40 PM UTC
Duration: 1h 51m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 3 updates
  1. investigating Sep 04, 2025, 09:48 PM UTC

    We are currently investigating some hosted clusters experiencing an unintentional reduction in ephemeral storage for default Kubernetes pods.

  2. identified Sep 04, 2025, 10:17 PM UTC

    The issue has been identified and a fix is being implemented.

  3. resolved Sep 04, 2025, 11:40 PM UTC

    This incident has been resolved.

Notice August 13, 2025

Docs for the self-hosted Astronomer Software product are down

Detected by Pingoru: Aug 13, 2025, 07:33 PM UTC
Resolved: Aug 13, 2025, 10:33 PM UTC
Duration: 2h 59m
Timeline · 3 updates
  1. investigating Aug 13, 2025, 07:33 PM UTC

    We are currently migrating our docs to a new platform. All docs are working except the docs for our self-hosted platform, generally referred to as Astronomer Software. You can find a working mirror here: https://clear-mousepad.cloudvent.net/docs/software/

  2. identified Aug 13, 2025, 09:16 PM UTC

    The issue has been identified and a fix is being implemented.

  3. resolved Aug 13, 2025, 10:33 PM UTC

    This incident has been resolved.

Major August 13, 2025

Issue with Live task logs with Runtime 3.0-7

Detected by Pingoru: Aug 13, 2025, 08:13 AM UTC
Resolved: Aug 19, 2025, 05:01 PM UTC
Duration: 6d 8h
Affected: Scheduling and Running DAGs and Tasks
Timeline · 3 updates
  1. investigating Aug 13, 2025, 08:13 AM UTC

    We are aware of an issue affecting live task logs with Runtime 3.0-7 and are actively investigating.

  2. identified Aug 14, 2025, 08:47 AM UTC

    The issue has been identified and a fix is being implemented.

  3. resolved Aug 19, 2025, 05:01 PM UTC

    Astro Runtime 3.0-8 has been released, which contains the fix for this issue. Release notes are available here: https://www.astronomer.io/docs/astro/runtime-release-notes#astro-runtime-30-8

Minor August 11, 2025

False Positive SLA Violation Alerts on Astro Observe

Detected by Pingoru: Aug 11, 2025, 03:10 PM UTC
Resolved: Aug 11, 2025, 05:45 PM UTC
Duration: 2h 34m
Affected: Astro Observe
Timeline · 3 updates
  1. investigating Aug 11, 2025, 03:10 PM UTC

    We are seeing reports that users of Astro Observe are getting false positive alerts for Data Freshness SLAs. We are currently investigating the cause of these false positives.

  2. monitoring Aug 11, 2025, 03:22 PM UTC

    A fix has been implemented for the false positive alerts affecting Data Freshness SLAs in Astro Observe. Improvement has been observed, and we'll continue monitoring to ensure full resolution.

  3. resolved Aug 11, 2025, 05:45 PM UTC

    The false positive alerts for Data Freshness SLAs in Astro Observe have been resolved.

Major July 21, 2025

Customers using Azure-managed subscriptions may be unable to access the Astro UI.

Detected by Pingoru: Jul 21, 2025, 09:23 AM UTC
Resolved: Jul 21, 2025, 02:05 PM UTC
Duration: 4h 41m
Affected: Deployment Access, Deployment Management, Cloud UI, Cloud API, Dashboards and Analytics
Timeline · 5 updates
  1. investigating Jul 21, 2025, 09:23 AM UTC

    We are actively investigating the issue.

  2. investigating Jul 21, 2025, 01:04 PM UTC

    We are continuing to investigate this issue.

  3. identified Jul 21, 2025, 01:04 PM UTC

    The issue has been identified and a fix is being implemented.

  4. monitoring Jul 21, 2025, 01:30 PM UTC

    A fix has been implemented and we are monitoring the results.

  5. resolved Jul 21, 2025, 02:05 PM UTC

    This incident has been resolved.

Major July 18, 2025

GCP us-east1 incident may affect Astro clusters in this region

Detected by Pingoru: Jul 18, 2025, 04:25 PM UTC
Resolved: Jul 18, 2025, 06:33 PM UTC
Duration: 2h 8m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 2 updates
  1. investigating Jul 18, 2025, 04:25 PM UTC

    Google Cloud has put up a status page indicating that several services in us-east1 are affected by an incident. Astro clusters in this region will be affected by this. Clusters in other regions and other clouds should not be affected, as none of the control plane components for Astro are hosted in this region. For more information, follow the GCP incident: https://status.cloud.google.com/incidents/8cY8jdUpEGGbsSMSQk7J

  2. resolved Jul 18, 2025, 06:33 PM UTC

    Per Google, this outage is now resolved.

Major July 10, 2025

GitHub Integration Image Deploys Failing

Detected by Pingoru: Jul 10, 2025, 03:24 PM UTC
Resolved: Jul 10, 2025, 05:06 PM UTC
Duration: 1h 42m
Affected: Deployment Management
Timeline · 2 updates
  1. investigating Jul 10, 2025, 03:24 PM UTC

    We’re currently investigating an issue where GitHub Integration image deploys are failing.

  2. resolved Jul 10, 2025, 05:06 PM UTC

    The issue affecting GitHub Integration image deploys has been resolved.

Critical July 3, 2025

Some clusters are unable to start new KPO tasks

Detected by Pingoru: Jul 03, 2025, 11:19 PM UTC
Resolved: Jul 04, 2025, 02:10 AM UTC
Duration: 2h 51m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 4 updates
  1. investigating Jul 03, 2025, 11:19 PM UTC

    Some clusters that were updated today will fail to run any KPO tasks.

  2. identified Jul 03, 2025, 11:52 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Jul 04, 2025, 01:06 AM UTC

    A fix has been implemented. We are currently monitoring the results.

  4. resolved Jul 04, 2025, 02:10 AM UTC

    This incident has been resolved.

Minor July 1, 2025

Astro CLI 1.35 may unintentionally modify worker queue configurations

Detected by Pingoru: Jul 01, 2025, 01:59 PM UTC
Resolved: Jul 01, 2025, 06:25 PM UTC
Duration: 4h 26m
Affected: Deployment Management
Timeline · 4 updates
  1. investigating Jul 01, 2025, 01:59 PM UTC

    Upgrading to Astro CLI version 1.35 can lead to unintended changes in your worker queue settings, particularly when deploying with modified or missing workerQueues definitions. Please use CLI version 1.34 or lower.

  2. identified Jul 01, 2025, 02:03 PM UTC

    We’ve identified the root cause of the issue in Astro CLI version 1.35 that results in unintentional modifications to worker queues. We have yanked Astro CLI 1.35. Yanked release: https://github.com/astronomer/astro-cli/releases/tag/v1.35.0

  3. identified Jul 01, 2025, 04:55 PM UTC

    Deploys using Astro CLI version 1.35 are now blocked to prevent unintentional changes to worker queues. Please use CLI version 1.34 or lower.

  4. resolved Jul 01, 2025, 06:25 PM UTC

    We’ve verified that all affected customers have been contacted regarding the issue with Astro CLI version 1.35.0.
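The advisory in this incident ("use CLI version 1.34 or lower") could be enforced in a deploy script with a numeric version comparison rather than a string comparison. The following is a minimal sketch: the helper names are hypothetical, the version string is assumed to be supplied by the caller in MAJOR.MINOR.PATCH form, and releases after 1.35 are not addressed by the incident text, so the check encodes only the advisory as written.

```python
from typing import Tuple

# Highest (major, minor) release the incident advisory recommends.
MAX_RECOMMENDED = (1, 34)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.34.1' or 'v1.35.0' into a tuple of integers."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_recommended(cli_version: str) -> bool:
    """True when the version is 1.34 or lower, per the advisory text."""
    return parse_version(cli_version)[:2] <= MAX_RECOMMENDED
```

Comparing integer tuples avoids the string-ordering trap in which "1.9" would sort after "1.34".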

Major June 12, 2025

Astro clusters in GCP are having scaling issues due to a GCP outage

Detected by Pingoru: Jun 12, 2025, 06:49 PM UTC
Resolved: Jun 12, 2025, 11:23 PM UTC
Duration: 4h 34m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 5 updates
  1. investigating Jun 12, 2025, 06:49 PM UTC

    There is an active GCP outage that is affecting Astro customers using GCP.

  2. investigating Jun 12, 2025, 06:50 PM UTC

    We will continue to monitor the issue and update this page. GCP status: https://status.cloud.google.com/

  3. identified Jun 12, 2025, 06:53 PM UTC

    At this time we believe that Deployments on Azure and AWS are unaffected. We are currently checking our components to be certain. We have seen task failure rates increase on Astro Deployments on GCP. We will pass along any updates we receive from Google regarding this issue.

  4. monitoring Jun 12, 2025, 09:49 PM UTC

    Updates from status.cloud.google.com indicate most if not all of the issues are no longer occurring. On Astro, we have seen our metrics return to normal. We tentatively believe that the problems affecting Astro have passed. We continue to monitor the situation.

  5. resolved Jun 12, 2025, 11:23 PM UTC

    This incident appears to be resolved as it pertains to Astro.

Minor June 5, 2025

Creating a Connection can crash the browser tab

Detected by Pingoru: Jun 05, 2025, 11:48 PM UTC
Resolved: Jun 06, 2025, 03:18 PM UTC
Duration: 15h 30m
Affected: Cloud UI
Timeline · 3 updates
  1. investigating Jun 05, 2025, 11:48 PM UTC

    If you create certain Connection types in the Environments menu, it can crash your browser tab. Currently it affects SSH, SMTP, SFTP, Postgres, and Generic. There could be others. We are investigating currently.

  2. identified Jun 06, 2025, 12:48 PM UTC

    The issue has been identified, and our team is actively working on a fix.

  3. resolved Jun 06, 2025, 03:18 PM UTC

    This issue has been resolved.

Major May 20, 2025

403 Errors for Image Deploys

Detected by Pingoru: May 20, 2025, 04:02 PM UTC
Resolved: May 20, 2025, 05:25 PM UTC
Duration: 1h 23m
Affected: Deployment Management, Cloud Image Repository
Timeline · 3 updates
  1. investigating May 20, 2025, 04:02 PM UTC

    A small subset of customers have reported 403 errors when running the `astro deploy` command to deploy a new image. We are actively investigating this issue. If you are experiencing these errors, we encourage you to contact support and include the login command you used, your Astro CLI and Docker versions, and any log messages.

  2. monitoring May 20, 2025, 04:52 PM UTC

    We've implemented a mitigation for this issue and the affected clusters should see successful image pushes. We will continue to monitor for additional errors.

  3. resolved May 20, 2025, 05:25 PM UTC

    We have determined that this error is caused by cached credentials which are no longer valid after an internal change in Astro to the image registry. The fix must be performed client-side (i.e. on the machine running `astro deploy`). If you experience this error, run `docker logout` for each Astro registry that this machine has cached credentials for. By default, credentials are stored in ~/.docker/config.json, and if you are using this default setting, the following bash script will identify cached credentials and run `docker logout` for those that correspond to Astro registries:

        for domain in $(grep 'registry.astronomer.run' ~/.docker/config.json | awk '{print $1}' | tr -d '":' | sort | uniq); do
          docker logout "$domain"
        done
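The same cleanup can be sketched by parsing the config file as JSON instead of grepping it. This is a minimal sketch and not Astronomer's script: it assumes cached registry entries sit under the top-level "auths" key of ~/.docker/config.json, the function names are hypothetical, and it only builds the `docker logout` commands so they can be reviewed before anything is run.

```python
import json
from typing import List

# Substring that marks an Astro registry host, taken from the script in the update.
ASTRO_MARKER = "registry.astronomer.run"


def astro_registries(config_text: str) -> List[str]:
    """Return the Astro registry hosts listed under 'auths' in a Docker config."""
    config = json.loads(config_text)
    return sorted(host for host in config.get("auths", {}) if ASTRO_MARKER in host)


def logout_commands(config_text: str) -> List[List[str]]:
    """Build one 'docker logout <host>' argument list per Astro registry."""
    return [["docker", "logout", host] for host in astro_registries(config_text)]
```

Given the text of the config file, each returned argument list could be passed to `subprocess.run` once the hosts look right.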

Major May 16, 2025

Configuration issue affecting Runtime 9 is disrupting DAG execution on those deployments

Detected by Pingoru: May 16, 2025, 11:28 AM UTC
Resolved: May 16, 2025, 01:09 PM UTC
Duration: 1h 41m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 3 updates
  1. investigating May 16, 2025, 11:28 AM UTC

    We are currently investigating the issue.

  2. identified May 16, 2025, 12:12 PM UTC

    A fix has been validated and is rolling out to affected deployments.

  3. resolved May 16, 2025, 01:09 PM UTC

    This incident has been resolved.

Major April 18, 2025

Stuck worker pods resulting in tasks failing in the queued state

Detected by Pingoru: Apr 18, 2025, 02:25 PM UTC
Resolved: Apr 19, 2025, 06:12 AM UTC
Duration: 15h 46m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 4 updates
  1. investigating Apr 18, 2025, 02:25 PM UTC

    In some deployments, worker pods are getting stuck in the initialization state for an extended period of time. Due to this, queued tasks are unable to run and fail. This is not affecting all deployments. We are investigating which deployments are affected and why.

  2. investigating Apr 18, 2025, 07:12 PM UTC

    We are continuing to investigate this issue.

  3. investigating Apr 18, 2025, 09:36 PM UTC

    The incident is resolved.

  4. resolved Apr 19, 2025, 06:12 AM UTC

    This incident has been resolved.

Minor April 7, 2025

Cost Breakdown Dashboard data update delayed

Detected by Pingoru: Apr 07, 2025, 01:42 PM UTC
Resolved: Apr 07, 2025, 07:18 PM UTC
Duration: 5h 35m
Affected: Cloud UI
Timeline · 3 updates
  1. identified Apr 07, 2025, 01:42 PM UTC

    Data shown in the Organization Dashboards Cost Breakdown (for Enterprise customers) is delayed. As stated on the page itself, the latest data is as of April 4th. The processing to update this dashboard is currently ongoing, and we expect the data to be refreshed at approximately 16:00 UTC.

  2. identified Apr 07, 2025, 04:43 PM UTC

    Deployment cost is now up to date, but compute costs for some customers remain outdated. We are working with our billing vendor to determine the source of the issue.

  3. resolved Apr 07, 2025, 07:18 PM UTC

    This issue is now resolved except for one customer who we have contacted directly.
