Alation Cloud Service incident
Alation Cloud Service - DEV unavailable for some customers in US region
Alation Cloud Service experienced a minor incident on May 5, 2025, affecting Americas (US-east) - Dev and Americas (US-west) - Dev, lasting 3d 3h. The incident has been resolved; the full update timeline is below.
Affected components
- Americas (US-east) - Dev
- Americas (US-west) - Dev
Update timeline
- investigating May 05, 2025, 07:56 AM UTC
We have detected intermittent availability problems affecting DEV tenants in the US cluster, and we are currently investigating the cause.
- investigating May 05, 2025, 08:02 AM UTC
We are still investigating the issue and will post an update when we have more information.
- identified May 05, 2025, 10:01 AM UTC
We have traced the issue to the cache cluster and are working with our vendor to resolve it. We will share more information as it becomes available.
- identified May 05, 2025, 02:01 PM UTC
We have added resources to the cluster and are waiting for the cluster to rebalance.
- identified May 05, 2025, 02:12 PM UTC
We’re still seeing elevated connection activity within our application, which impacts the Dev tenants in the US cluster. Our team is actively investigating and working on a fix. We apologize for the inconvenience and appreciate your patience.
- identified May 05, 2025, 04:45 PM UTC
Our team is actively working on a hotfix to resolve the ongoing issue. We will update as soon as the fix is ready for release. We appreciate your understanding and patience.
- identified May 06, 2025, 01:05 PM UTC
Our team is actively working on a hotfix to address the issue, and it is currently undergoing testing. We will share another update as soon as the hotfix has passed validation and is ready for deployment.
- identified May 07, 2025, 03:05 PM UTC
The hotfix has successfully passed initial validation and is currently undergoing performance testing. Our team is monitoring the results closely to ensure it meets all stability and reliability standards. We will provide another update once testing is complete and the hotfix is ready for deployment.
- identified May 07, 2025, 04:34 PM UTC
We are currently preparing for the patch rollout and ensuring all systems are ready for a smooth deployment. Our team is finalizing the necessary steps, and we will share another update once we are ready to begin the rollout.
- identified May 07, 2025, 11:40 PM UTC
Minor release 2025.1.2.1 with a fix for the issue impacting Dev tenants has been released. Deployment is scheduled for 5:00 PM PDT.
- monitoring May 08, 2025, 05:17 AM UTC
We have initiated the deployment of the hotfix (version 2025.1.2.1). This rollout is being conducted in phases to ensure stability across all systems. Our team is actively monitoring the process and will provide updates as it progresses.
- monitoring May 08, 2025, 05:17 AM UTC
The hotfix (version 2025.1.2.1) has been successfully rolled out. At this time, all systems are operating as expected. Our team will continue to monitor the infrastructure closely to ensure ongoing stability and performance. We’ll share further updates if needed.
- resolved May 08, 2025, 08:36 AM UTC
The issue has been fully resolved. The hotfix has been deployed successfully, and all systems are stable. We will continue to monitor as part of our standard operations. Thank you for your patience throughout this process.