- Detected by Pingoru: Oct 30, 2025, 12:58 AM UTC
- Resolved: Oct 30, 2025, 04:27 AM UTC
- Duration: 3h 28m
Affected: Deployment Management
Timeline · 2 updates
-
investigating Oct 30, 2025, 12:58 AM UTC
https://www.astronomer.io/docs/astro/deployment-health-incidents
-
resolved Oct 30, 2025, 04:27 AM UTC
Deployment Health Incidents have been temporarily disabled.
- Detected by Pingoru: Oct 29, 2025, 11:57 PM UTC
- Resolved: Oct 30, 2025, 12:58 AM UTC
- Duration: 1h 1m
Affected: Deployment Access, Deployment Management, Cloud UI, Astro Observe, Cloud API, Cloud Image Repository, Cluster Management, Dashboards and Analytics
Timeline · 4 updates
-
investigating Oct 29, 2025, 11:57 PM UTC
We are currently investigating this issue.
-
investigating Oct 30, 2025, 12:51 AM UTC
Deployment Health Incidents may not be working. See https://www.astronomer.io/docs/astro/deployment-health-incidents
-
monitoring Oct 30, 2025, 12:51 AM UTC
Deployment Health Incidents may not be working. See https://www.astronomer.io/docs/astro/deployment-health-incidents
-
resolved Oct 30, 2025, 12:58 AM UTC
This incident has been resolved.
Critical October 29, 2025
- Detected by Pingoru: Oct 29, 2025, 04:05 PM UTC
- Resolved: Oct 30, 2025, 12:49 AM UTC
- Duration: 8h 44m
Affected: Cloud UI
Timeline · 4 updates
-
investigating Oct 29, 2025, 04:05 PM UTC
We are currently investigating an apparent issue with Azure's Front Door CDN that is causing DNS timeouts and unavailability of the Astro Cloud UI. Airflow deployments are unaffected and continue to process tasks. We have updated our DNS routing to bypass Front Door for now and are seeing access to the Cloud UI being restored.
-
monitoring Oct 29, 2025, 04:05 PM UTC
A fix has been implemented and we are monitoring the results.
-
monitoring Oct 29, 2025, 04:35 PM UTC
We are continuing to monitor for any further effects from this Azure outage. You can follow Azure's status here: https://azure.status.microsoft/en-us/status
-
resolved Oct 30, 2025, 12:49 AM UTC
Resolved per Azure status: https://azure.status.microsoft/en-us/status
- Detected by Pingoru: Oct 28, 2025, 06:05 PM UTC
- Resolved: Oct 28, 2025, 11:51 PM UTC
- Duration: 5h 46m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 3 updates
-
investigating Oct 28, 2025, 06:05 PM UTC
We are currently investigating this issue and have been notified by AWS of issues with VM scale ups in AWS us-east-1, specifically the use1-az2 AZ. We are working to identify affected deployments and to put mitigations in place.
-
monitoring Oct 28, 2025, 07:39 PM UTC
Customer deployments should no longer be affected; we are monitoring to confirm resolution.
-
resolved Oct 28, 2025, 11:51 PM UTC
This incident has been resolved.
- Detected by Pingoru: Oct 20, 2025, 10:00 AM UTC
- Resolved: Oct 20, 2025, 09:49 PM UTC
- Duration: 11h 48m
Affected: Scheduling and Running DAGs and Tasks, Cloud UI, Cluster Management
Timeline · 8 updates
-
investigating Oct 20, 2025, 10:00 AM UTC
We are aware of an ongoing AWS outage in the US-EAST-1 (N. Virginia) region that is impacting multiple AWS services and related infrastructure components. Customers with Astro clusters and deployments hosted on AWS may experience degraded performance, failed task executions, or delays in accessing their environments. Our team is actively monitoring the situation and assessing the impact across affected deployments. For real-time updates from AWS, please refer to their Service Health Dashboard (https://health.aws.amazon.com/health/status). The next update will be provided as more information becomes available.
-
investigating Oct 20, 2025, 10:00 AM UTC
We are continuing to investigate this issue.
-
investigating Oct 20, 2025, 01:08 PM UTC
We’re aware that the Airflow UI has been running very slowly following the recent AWS outage. Our team is actively investigating the issue.
-
investigating Oct 20, 2025, 02:27 PM UTC
The AWS outage is affecting an internal tool, which is causing Airflow UI slowness in clusters running on all clouds (not just AWS). Our development team is working on a fix.
-
investigating Oct 20, 2025, 04:12 PM UTC
We are continuing to investigate this issue.
-
monitoring Oct 20, 2025, 04:30 PM UTC
We have made a hotfix update to Astro to relieve the Airflow UI slowness in Azure, GCP, and in AWS regions other than us-east-1. We are continuing to monitor the impact of the change, but early signs indicate that the speed of the UI should be improving. This update has no effect on the issues unique to deployments in AWS us-east-1.
-
monitoring Oct 20, 2025, 06:19 PM UTC
We have observed pods that were previously stuck in the pending state slowly getting scheduled on EC2 nodes following mitigations applied by the AWS team. This should start resolving issues with task execution. We are actively monitoring the situation.
-
resolved Oct 20, 2025, 09:49 PM UTC
All Astronomer components have returned to a healthy state.
- Detected by Pingoru: Oct 06, 2025, 04:08 PM UTC
- Resolved: Oct 06, 2025, 10:08 PM UTC
- Duration: 5h 59m
Affected: Deployment Access, Deployment Management, Cloud UI, Cloud API, Cloud Image Repository, Cluster Management, Dashboards and Analytics
Timeline · 3 updates
-
identified Oct 06, 2025, 04:08 PM UTC
We have identified the source of this issue and are putting mitigations in place. Airflow deployment task execution is unaffected. DAG triggering via the Airflow API may experience intermittent issues.
-
monitoring Oct 06, 2025, 05:52 PM UTC
The root cause of this incident has been fully identified, and our engineering teams are continuing to work on implementing mitigations. We are also continuing to monitor this issue and the associated underlying systems. Incidence of this issue has dropped significantly; however, degraded performance is still possible.
-
resolved Oct 06, 2025, 10:08 PM UTC
This incident has been resolved.
- Detected by Pingoru: Oct 01, 2025, 05:16 PM UTC
- Resolved: Oct 01, 2025, 10:17 PM UTC
- Duration: 5h
Affected: Scheduling and Running DAGs and Tasks, Deployment Access
Timeline · 4 updates
- Detected by Pingoru: Sep 04, 2025, 09:48 PM UTC
- Resolved: Sep 04, 2025, 11:40 PM UTC
- Duration: 1h 51m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 3 updates
-
investigating Sep 04, 2025, 09:48 PM UTC
We are currently investigating some hosted clusters experiencing an unintentional reduction in ephemeral storage for default Kubernetes pods.
-
identified Sep 04, 2025, 10:17 PM UTC
The issue has been identified and a fix is being implemented.
-
resolved Sep 04, 2025, 11:40 PM UTC
This incident has been resolved.
- Detected by Pingoru: Aug 13, 2025, 07:33 PM UTC
- Resolved: Aug 13, 2025, 10:33 PM UTC
- Duration: 2h 59m
Timeline · 3 updates
-
investigating Aug 13, 2025, 07:33 PM UTC
We are currently migrating our docs to a new platform. All docs are working except the docs for our self-hosted platform, generally referred to as Astronomer Software. You can find a working mirror here: https://clear-mousepad.cloudvent.net/docs/software/
-
identified Aug 13, 2025, 09:16 PM UTC
The issue has been identified and a fix is being implemented.
-
resolved Aug 13, 2025, 10:33 PM UTC
This incident has been resolved.
- Detected by Pingoru: Aug 13, 2025, 08:13 AM UTC
- Resolved: Aug 19, 2025, 05:01 PM UTC
- Duration: 6d 8h
Affected: Scheduling and Running DAGs and Tasks
Timeline · 3 updates
-
investigating Aug 13, 2025, 08:13 AM UTC
We are aware of an issue affecting live task logs with Runtime 3.0-7 and are actively investigating.
-
identified Aug 14, 2025, 08:47 AM UTC
The issue has been identified and a fix is being implemented.
-
resolved Aug 19, 2025, 05:01 PM UTC
Astro Runtime 3.0-8 has been released, which contains the fix for this issue. Release notes are available here: https://www.astronomer.io/docs/astro/runtime-release-notes#astro-runtime-30-8
- Detected by Pingoru: Aug 11, 2025, 03:10 PM UTC
- Resolved: Aug 11, 2025, 05:45 PM UTC
- Duration: 2h 34m
Affected: Astro Observe
Timeline · 3 updates
-
investigating Aug 11, 2025, 03:10 PM UTC
We are seeing reports that users of Astro Observe are getting false positive alerts for Data Freshness SLAs. We are currently investigating the cause of these false positives.
-
monitoring Aug 11, 2025, 03:22 PM UTC
A fix has been implemented for the false positive alerts affecting Data Freshness SLAs in Astro Observe. Improvement has been observed; we'll continue monitoring to ensure full resolution.
-
resolved Aug 11, 2025, 05:45 PM UTC
The false positive alerts for Data Freshness SLAs in Astro Observe have been resolved.
- Detected by Pingoru: Jul 21, 2025, 09:23 AM UTC
- Resolved: Jul 21, 2025, 02:05 PM UTC
- Duration: 4h 41m
Affected: Deployment Access, Deployment Management, Cloud UI, Cloud API, Dashboards and Analytics
Timeline · 5 updates
-
investigating Jul 21, 2025, 09:23 AM UTC
We are actively investigating the issue.
-
investigating Jul 21, 2025, 01:04 PM UTC
We are continuing to investigate this issue.
-
identified Jul 21, 2025, 01:04 PM UTC
The issue has been identified and a fix is being implemented.
-
monitoring Jul 21, 2025, 01:30 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Jul 21, 2025, 02:05 PM UTC
This incident has been resolved.
- Detected by Pingoru: Jul 18, 2025, 04:25 PM UTC
- Resolved: Jul 18, 2025, 06:33 PM UTC
- Duration: 2h 8m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 2 updates
-
investigating Jul 18, 2025, 04:25 PM UTC
Google Cloud has put up a status page indicating that several services in us-east1 are affected by an incident. Astro clusters in this region will be affected by this. Clusters in other regions and other clouds should not be affected, as none of the control plane components for Astro are hosted in this region. For more information, follow the GCP incident: https://status.cloud.google.com/incidents/8cY8jdUpEGGbsSMSQk7J
-
resolved Jul 18, 2025, 06:33 PM UTC
Per Google, this outage is now resolved.
- Detected by Pingoru: Jul 10, 2025, 03:24 PM UTC
- Resolved: Jul 10, 2025, 05:06 PM UTC
- Duration: 1h 42m
Affected: Deployment Management
Timeline · 2 updates
-
investigating Jul 10, 2025, 03:24 PM UTC
We’re currently investigating an issue where GitHub Integration image deploys are failing.
-
resolved Jul 10, 2025, 05:06 PM UTC
The issue affecting GitHub Integration image deploys has been resolved.
- Detected by Pingoru: Jul 03, 2025, 11:19 PM UTC
- Resolved: Jul 04, 2025, 02:10 AM UTC
- Duration: 2h 51m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 4 updates
-
investigating Jul 03, 2025, 11:19 PM UTC
Some clusters that were updated today will fail to run any KubernetesPodOperator (KPO) tasks.
-
identified Jul 03, 2025, 11:52 PM UTC
The issue has been identified and a fix is being implemented.
-
monitoring Jul 04, 2025, 01:06 AM UTC
A fix has been implemented. We are currently monitoring the results.
-
resolved Jul 04, 2025, 02:10 AM UTC
This incident has been resolved.
- Detected by Pingoru: Jul 01, 2025, 01:59 PM UTC
- Resolved: Jul 01, 2025, 06:25 PM UTC
- Duration: 4h 26m
Affected: Deployment Management
Timeline · 4 updates
-
investigating Jul 01, 2025, 01:59 PM UTC
Upgrading to Astro CLI version 1.35 can lead to unintended changes in your worker queue settings, particularly when deploying with modified or missing workerQueues definitions. Please use CLI version 1.34 or lower.
-
identified Jul 01, 2025, 02:03 PM UTC
We’ve identified the root cause of the issue in Astro CLI version 1.35 that results in unintentional modifications to worker queues. We have yanked Astro CLI 1.35. Yanked release: https://github.com/astronomer/astro-cli/releases/tag/v1.35.0
-
identified Jul 01, 2025, 04:55 PM UTC
Deploys using Astro CLI version 1.35 are now blocked to prevent unintentional changes to worker queues. Please use CLI version 1.34 or lower.
-
resolved Jul 01, 2025, 06:25 PM UTC
We’ve verified that all affected customers have been contacted regarding the issue with Astro CLI version 1.35.0.
- Detected by Pingoru: Jun 12, 2025, 06:49 PM UTC
- Resolved: Jun 12, 2025, 11:23 PM UTC
- Duration: 4h 34m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 5 updates
-
investigating Jun 12, 2025, 06:49 PM UTC
There is an active GCP outage that is affecting Astro customers using GCP.
-
investigating Jun 12, 2025, 06:50 PM UTC
See https://status.cloud.google.com/ for details. We will continue to monitor the issue and update this page.
-
identified Jun 12, 2025, 06:53 PM UTC
At this time we believe that Deployments on Azure and AWS are unaffected. We are currently checking our components to be certain. We have seen task failure rates increase on Astro Deployments on GCP. We will pass along any updates we receive from Google regarding this issue.
-
monitoring Jun 12, 2025, 09:49 PM UTC
Updates from status.cloud.google.com indicate most if not all of the issues are no longer occurring. On Astro, we have seen our metrics return to normal. We tentatively believe that the problems affecting Astro have passed. We continue to monitor the situation.
-
resolved Jun 12, 2025, 11:23 PM UTC
This incident appears to be resolved as it pertains to Astro.
- Detected by Pingoru: Jun 05, 2025, 11:48 PM UTC
- Resolved: Jun 06, 2025, 03:18 PM UTC
- Duration: 15h 30m
Affected: Cloud UI
Timeline · 3 updates
-
investigating Jun 05, 2025, 11:48 PM UTC
If you create certain Connection types in the Environments menu, it can crash your browser tab. Currently it affects SSH, SMTP, SFTP, Postgres, and Generic. There could be others. We are currently investigating.
-
identified Jun 06, 2025, 12:48 PM UTC
The issue has been identified, and our team is actively working on a fix.
-
resolved Jun 06, 2025, 03:18 PM UTC
This issue has been resolved.
- Detected by Pingoru: May 20, 2025, 04:02 PM UTC
- Resolved: May 20, 2025, 05:25 PM UTC
- Duration: 1h 23m
Affected: Deployment Management, Cloud Image Repository
Timeline · 3 updates
-
investigating May 20, 2025, 04:02 PM UTC
A small subset of customers have reported 403 errors when running the `astro deploy` command to deploy a new image. We are actively investigating this issue. If you are experiencing these errors, we encourage you to contact support and include the login command you used, your Astro CLI and Docker versions, and any log messages.
-
monitoring May 20, 2025, 04:52 PM UTC
We've implemented a mitigation for this issue and the affected clusters should see successful image pushes. We will continue to monitor for additional errors.
-
resolved May 20, 2025, 05:25 PM UTC
We have determined that this error is caused by cached credentials which are no longer valid after an internal change in Astro to the image registry. The fix must be performed client-side (i.e. on the machine running `astro deploy`). If you experience this error, run `docker logout` for each Astro registry that this machine has cached credentials for. By default, credentials are stored in ~/.docker/config.json, and if you are using this default setting, the following bash script will identify cached credentials and run docker logout for those that correspond to Astro registries:
    for domain in $(grep 'registry.astronomer.run' ~/.docker/config.json | awk '{print $1}' | tr -d '":' | sort | uniq); do
      docker logout "$domain"
    done
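As an alternative to the grep-based script above, the same lookup can be sketched in Python by parsing the config file as JSON. This is a minimal sketch, not an official Astronomer tool: it assumes cached credentials appear as keys under the standard `auths` object of the Docker config file, the helper names `astro_registries` and `load_docker_config` are hypothetical, and it only prints the `docker logout` commands rather than running them.

```python
import json
from pathlib import Path

# Substring used by the original script to recognise Astro registries.
ASTRO_REGISTRY_MARKER = "registry.astronomer.run"


def astro_registries(config: dict) -> list[str]:
    """Return the sorted, de-duplicated registry hosts under "auths"
    whose name contains the Astro registry marker."""
    auths = config.get("auths") or {}
    return sorted({host for host in auths if ASTRO_REGISTRY_MARKER in host})


def load_docker_config(path: Path) -> dict:
    """Load a Docker config file, returning an empty dict if it is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text())


if __name__ == "__main__":
    # Assumes the default credential location named in the update above.
    default_path = Path.home() / ".docker" / "config.json"
    for host in astro_registries(load_docker_config(default_path)):
        # Print the command instead of executing it, so it can be reviewed first.
        print(f"docker logout {host}")
```

Printing the commands keeps the sketch side-effect free; the output can be reviewed and then run manually. Credentials managed through a credential helper may not be listed under `auths`, in which case this sketch will not find them.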
- Detected by Pingoru: May 16, 2025, 11:28 AM UTC
- Resolved: May 16, 2025, 01:09 PM UTC
- Duration: 1h 41m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 3 updates
-
investigating May 16, 2025, 11:28 AM UTC
We are currently investigating the issue.
-
identified May 16, 2025, 12:12 PM UTC
A fix has been validated and is rolling out to affected deployments.
-
resolved May 16, 2025, 01:09 PM UTC
This incident has been resolved.
- Detected by Pingoru: Apr 18, 2025, 02:25 PM UTC
- Resolved: Apr 19, 2025, 06:12 AM UTC
- Duration: 15h 46m
Affected: Scheduling and Running DAGs and Tasks
Timeline · 4 updates
-
investigating Apr 18, 2025, 02:25 PM UTC
In some deployments, worker pods are getting stuck in the initialization state for an extended period of time. Due to this, queued tasks are unable to run and fail. This is not affecting all deployments. We are investigating which deployments are affected and why.
-
investigating Apr 18, 2025, 07:12 PM UTC
We are continuing to investigate this issue.
-
investigating Apr 18, 2025, 09:36 PM UTC
The incident is resolved.
-
resolved Apr 19, 2025, 06:12 AM UTC
This incident has been resolved.
- Detected by Pingoru: Apr 07, 2025, 01:42 PM UTC
- Resolved: Apr 07, 2025, 07:18 PM UTC
- Duration: 5h 35m
Affected: Cloud UI
Timeline · 3 updates
-
identified Apr 07, 2025, 01:42 PM UTC
Data shown in the Organization Dashboards Cost Breakdown (for Enterprise customers) is delayed. As stated on the page itself, the latest data is as of April 4th. The processing to update this dashboard is currently ongoing, and we expect the data to be refreshed at approximately 16:00 UTC.
-
identified Apr 07, 2025, 04:43 PM UTC
Deployment cost is now up to date, but compute costs for some customers remain outdated. We are working with our billing vendor to determine the source of the issue.
-
resolved Apr 07, 2025, 07:18 PM UTC
This issue is now resolved except for one customer who we have contacted directly.
- Detected by Pingoru: Mar 26, 2025, 04:59 AM UTC
- Resolved: Mar 26, 2025, 07:42 AM UTC
- Duration: 2h 42m
Affected: Scheduling and Running DAGs and Tasks, Cloud UI, Cloud API
Timeline · 4 updates