Alation Cloud Service Outage History

Alation Cloud Service has had 12 outages since June 4, 2025, totaling 83h 54m of downtime. Averaged over the two-year tracking window, that is 0.5 incidents per month. Each outage is summarised below with incident details, duration, and resolution information.
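The Duration figure on each incident follows from its Detected and Resolved timestamps, and the monthly average follows from dividing the incident count by the 24 months in the two-year window. Below is a minimal Python sketch of both calculations. The function names are illustrative and not part of any Pingoru or Alation tooling, and the listed durations can differ from the computed ones by a minute, presumably because of seconds that the displayed timestamps omit.

```python
from datetime import datetime, timedelta

# Timestamp layout used on this page, e.g. "Jun 04, 2025, 08:20 AM UTC".
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_stamp(text: str) -> datetime:
    """Parse a 'Detected' or 'Resolved' timestamp as shown on this page."""
    return datetime.strptime(text, STAMP_FORMAT)


def outage_duration(detected: str, resolved: str) -> timedelta:
    """Elapsed time between detection and resolution."""
    return parse_stamp(resolved) - parse_stamp(detected)


def format_duration(delta: timedelta) -> str:
    """Render a duration as days, hours, and minutes, e.g. '3h 11m'."""
    total_minutes = int(delta.total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    parts = []
    if days:
        parts.append(f"{days}d")
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    return " ".join(parts) or "0m"


def incidents_per_month(incident_count: int, months: int) -> float:
    """Average number of incidents per month over a tracking window."""
    return incident_count / months
```

For the June 4, 2025 incident, `format_duration(outage_duration("Jun 04, 2025, 08:20 AM UTC", "Jun 04, 2025, 11:31 AM UTC"))` gives `3h 11m`, and `incidents_per_month(12, 24)` gives `0.5`.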

Source: https://status.alationcloud.com

Minor April 24, 2026

Service Disruption Affecting Agent Interactions

Detected by Pingoru: Apr 24, 2026, 01:00 AM UTC
Resolved: Apr 25, 2026, 06:52 AM UTC
Duration: 1d 5h

Affected: Americas (US-east), Americas (US-west), Canada (Montreal), EMEA (Ireland), EMEA (Frankfurt), APAC (Sydney), APAC (Singapore), APAC (Tokyo), APAC (Mumbai)

Timeline · 4 updates

investigating Apr 24, 2026, 02:38 PM UTC

We recently experienced a service disruption that caused agent interactions to fail. The issue was traced to an expired token, which prevented a backend service from writing query results to storage. We have applied a temporary mitigation by recycling the affected tenant, which has restored normal functionality. Our team is actively working on a permanent fix to prevent this issue from recurring. Impact: This issue affected agent interactions only. All other platform functionality remained unaffected.
identified Apr 24, 2026, 02:39 PM UTC

We have applied a temporary mitigation by recycling the affected tenant, which has restored normal functionality. Our team is actively working on a permanent fix to prevent this issue from recurring.
monitoring Apr 24, 2026, 05:47 PM UTC

The issue causing agent interaction failures has been resolved, and the agent system is now fully functional. We are actively monitoring system health to ensure continued stability.
resolved Apr 25, 2026, 06:52 AM UTC

This incident has been resolved.
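The updates above attribute the failure to an expired token but do not describe the permanent fix. One common way to avoid this class of failure is to renew a token shortly before it expires rather than after a write fails. The following is a minimal Python sketch of that idea, not Alation's implementation: `RefreshingToken`, the `fetch` callback, and the 60-second margin are all hypothetical.

```python
import time
from typing import Callable, Optional, Tuple


class RefreshingToken:
    """Caches a token and renews it shortly before it expires.

    Hypothetical helper for illustration; `fetch` stands in for whatever
    call issues a new token and returns (token, lifetime_in_seconds).
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        margin_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._margin = margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a token that is valid for at least `margin_seconds` more."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

A caller would request `get()` immediately before each storage write, so a long-running service never holds a token past its lifetime. The injectable `clock` is there only to make the expiry logic testable without waiting.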

Notice April 23, 2026

Alation Service Degradation - Catalog editor

Detected by Pingoru: Apr 23, 2026, 03:57 PM UTC
Resolved: Apr 23, 2026, 03:57 PM UTC
Duration: —

Affected: Americas (US-east) - Dev, Americas (US-east), Americas (US-west), Americas (US-west) - Dev, Canada (Montreal), Canada (Montreal) - Dev, EMEA (Ireland), EMEA (Ireland) - Dev, EMEA (Frankfurt) - Dev, EMEA (Frankfurt), APAC (Sydney), APAC (Sydney) - Dev, APAC (Singapore) - Dev, APAC (Singapore), APAC (Tokyo), APAC (Tokyo) - Dev, APAC (Mumbai), APAC (Mumbai) - Dev, PoV (Proof of Value)

Timeline · 1 update

resolved Apr 23, 2026, 03:57 PM UTC

We discovered an issue where Rich Text Editor fields across the Catalog were not displaying content correctly. The issue has been resolved by rolling back the problematic deployment.

Notice March 11, 2026

Alation service degradation - Alation agent

Detected by Pingoru: Mar 11, 2026, 11:47 PM UTC
Resolved: Mar 12, 2026, 06:23 AM UTC
Duration: 6h 35m

Affected: Americas (US-east), Americas (US-west)

Timeline · 2 updates

monitoring Mar 11, 2026, 11:47 PM UTC

A service interruption to Alation agent was encountered by some customers in the US-East-1 and US-West-2 regions. The service interruption has been remediated, and we are monitoring the status.
resolved Mar 12, 2026, 06:23 AM UTC

We have not seen the error reoccur in the last few hours; we are marking the incident as resolved.

Major February 19, 2026

Alation Service Degradation

Detected by Pingoru: Feb 19, 2026, 11:01 AM UTC
Resolved: Feb 19, 2026, 08:42 PM UTC
Duration: 9h 40m

Affected: Americas (US-east)

Timeline · 8 updates

Minor February 2, 2026

Degraded Service - Alation.

Detected by Pingoru: Feb 02, 2026, 01:01 PM UTC
Resolved: Feb 02, 2026, 04:51 PM UTC
Duration: 3h 49m

Affected: Americas (US-east)

Timeline · 5 updates

investigating Feb 02, 2026, 02:01 PM UTC

Alation service has recovered for most tenants and is operating normally. However, a limited number of tenants are still experiencing service disruption (login failures, timeouts, or degraded performance). Our engineering team is actively working with priority to restore service for the remaining affected tenants.
identified Feb 02, 2026, 02:18 PM UTC

Service has been restored for the majority of tenants. We have identified an issue affecting a small subset of tenants that are still experiencing errors and/or degraded performance. Targeted remediation is in progress to recover the remaining impacted tenants.
identified Feb 02, 2026, 03:01 PM UTC

Most tenants have recovered. A small subset of tenants remains impacted; we’re continuing targeted remediation.
monitoring Feb 02, 2026, 03:24 PM UTC

Service has been restored for the tenants that experienced failures. We are actively monitoring the infrastructure and application to validate expected behaviour.
resolved Feb 02, 2026, 04:51 PM UTC

Incident resolved. We’ll continue routine monitoring and will follow up if anything changes.

Notice October 21, 2025

Metadata extraction / QLI failures with BAD REQUEST HTTP response headers

Detected by Pingoru: Oct 21, 2025, 05:26 PM UTC
Resolved: Oct 21, 2025, 09:04 PM UTC
Duration: 3h 38m

Affected: Americas (US-east) - Dev, Americas (US-east)

Timeline · 3 updates

investigating Oct 21, 2025, 05:26 PM UTC

We are currently investigating an issue with the MDE Pipeline service, which is preventing data extraction and causing errors. The error is related to a connection timeout to the pipeline service. Our team is working to resolve the issue as quickly as possible. We will keep you posted on progress as information becomes available.
identified Oct 21, 2025, 06:35 PM UTC

Cause has been identified and fix implemented. Working on resolution.
resolved Oct 21, 2025, 09:04 PM UTC

The fix has been implemented and confirmed to resolve the issue. The root cause was the AWS US-East-1 outage of the previous day (Monday, October 20).

Minor October 20, 2025

Service Degradation for EU Customers

Detected by Pingoru: Oct 20, 2025, 09:45 AM UTC
Resolved: Oct 20, 2025, 02:25 PM UTC
Duration: 4h 39m

Affected: EMEA (Ireland)

Timeline · 3 updates

identified Oct 20, 2025, 02:18 PM UTC

We are investigating reports of degraded performance impacting customers in the EU region.
identified Oct 20, 2025, 02:20 PM UTC

A subset of EU customers may experience slower load times or timeouts when accessing the Alation application, as well as delays in query execution, search indexing, and access to catalog services.
resolved Oct 20, 2025, 02:25 PM UTC

The issue that was impacting customers in the EU region has been resolved. System performance is back to normal, and the services are now operating normally.

Critical October 20, 2025

Third-party provider outage (AWS)

Detected by Pingoru: Oct 20, 2025, 07:40 AM UTC
Resolved: Oct 20, 2025, 10:00 AM UTC
Duration: 2h 19m

Affected: Americas (US-east) - Dev, Americas (US-east)

Timeline · 4 updates

identified Oct 20, 2025, 08:07 AM UTC

We have detected elevated error rates and degraded performance across parts of the Alation platform. This is caused by a service disruption at AWS, which is affecting one or more of their core services that Alation depends on. Our own systems are healthy, but upstream instability is affecting service delivery for our users. Impact: Some users may experience slower response times, timeouts, or failures when using certain features (for example: data catalog search, ingestion jobs, API calls or dashboard refreshes). Data integrity is not impacted; no data loss or corruption has been detected. Queued operations will retry automatically once upstream services recover.
identified Oct 20, 2025, 09:52 AM UTC

AWS states that they are still working on finding the root cause and actively working on the issue.
monitoring Oct 20, 2025, 09:54 AM UTC

AWS further reports “significant signs of recovery”: most requests should now be succeeding, though some services still have latency and backlog to clear. We see early signs of Alation service recovery; we will keep you updated.
resolved Oct 20, 2025, 10:00 AM UTC

The underlying AWS service has recovered, and all Alation services have returned to normal operation for affected customers. Our teams will continue to monitor the environment to ensure continued stability.

Notice July 16, 2025

Latency across multiple regions

Detected by Pingoru: Jul 16, 2025, 03:59 PM UTC
Resolved: Jul 16, 2025, 08:59 PM UTC
Duration: 5h

Timeline · 3 updates

investigating Jul 16, 2025, 03:59 PM UTC

Our engineering teams are working to identify the issue and are actively working to mitigate the impact. We will provide updates here every 30 minutes or as new information becomes available.
monitoring Jul 16, 2025, 06:25 PM UTC

We have identified and deployed a fix across all tenants and are monitoring performance.
resolved Jul 16, 2025, 08:59 PM UTC

This incident has been resolved.

Notice July 15, 2025

Investigating latency across multiple regions

Detected by Pingoru: Jul 15, 2025, 04:27 PM UTC
Resolved: Jul 16, 2025, 05:56 AM UTC
Duration: 13h 28m

Timeline · 4 updates

investigating Jul 15, 2025, 04:27 PM UTC

We are currently investigating reports of latency across multiple regions. This is impacting availability and performance for some customers using our services. Our engineering teams are working to identify the issue and are actively working to mitigate the impact. We will provide updates here every 30 minutes or as new information becomes available.
monitoring Jul 15, 2025, 05:23 PM UTC

We have implemented a revised memory configuration, resulting in enhanced system stability and reduced latency. We will continue monitoring through the day and provide updates as appropriate.
monitoring Jul 15, 2025, 08:56 PM UTC

We are continuing to monitor environments for issues.
resolved Jul 16, 2025, 05:56 AM UTC

This incident has been resolved.

Major July 14, 2025

Investigating connectivity issues

Detected by Pingoru: Jul 14, 2025, 04:53 PM UTC
Resolved: Jul 14, 2025, 06:32 PM UTC
Duration: 1h 38m

Timeline · 4 updates

investigating Jul 14, 2025, 04:53 PM UTC

We are currently experiencing a service disruption across all regions. This is impacting availability and performance for some customers using our services. Our engineering teams are working to identify the issue and are actively working to mitigate the impact. We will provide updates here every 30 minutes or as new information becomes available.
identified Jul 14, 2025, 05:47 PM UTC

The issue has been identified and a fix is being implemented.
monitoring Jul 14, 2025, 06:31 PM UTC

The problem has been resolved, and all of our applications are now working properly. We are constantly monitoring the system.
resolved Jul 14, 2025, 06:32 PM UTC

This incident has been resolved.

Minor June 4, 2025

We are experiencing a connectivity issue on the US Cluster

Detected by Pingoru: Jun 04, 2025, 08:20 AM UTC
Resolved: Jun 04, 2025, 11:31 AM UTC
Duration: 3h 11m

Affected: Americas (US-east)

Timeline · 4 updates

investigating Jun 04, 2025, 08:20 AM UTC

We are currently experiencing a service disruption in the us-east1 region. This is impacting availability and performance for some customers using our services hosted in this region. Our engineering teams are working to identify the issue and are actively working to mitigate the impact. We will provide updates here every 30 minutes or as new information becomes available.
identified Jun 04, 2025, 08:53 AM UTC

Our investigation revealed the issue originated from underlying infrastructure limitations. We have scaled up resources in the affected region and are monitoring for stability.
monitoring Jun 04, 2025, 08:56 AM UTC

All impacted customers are back online. Our team is actively monitoring the systems to ensure everything remains stable and performs as expected.
resolved Jun 04, 2025, 11:31 AM UTC

This incident has been resolved.
