- Detected by Pingoru
- May 20, 2026, 03:38 PM UTC
- Resolved
- May 20, 2026, 07:04 PM UTC
- Duration
- 3h 26m
Affected: Americas (US-east) - Dev, Americas (US-east)
Timeline · 4 updates
-
investigating May 20, 2026, 03:38 PM UTC
We are investigating elevated latency and intermittent errors affecting Data Products and Alation AI. Other Alation functionality is unaffected. Mitigations are in progress, and our team is working to identify the root cause.
-
identified May 20, 2026, 04:07 PM UTC
We have identified the root cause of the elevated latency affecting Data Products and Alation AI. The team is working on mitigation.
-
monitoring May 20, 2026, 05:25 PM UTC
The underlying issue has been mitigated. Data Products and Alation AI are returning to normal operation. We are continuing to monitor service health to confirm full recovery.
-
resolved May 20, 2026, 07:04 PM UTC
This incident has been resolved. Data Products and Alation AI are operating normally. A detailed RCA will be shared shortly.
- Detected by Pingoru
- Apr 24, 2026, 01:00 AM UTC
- Resolved
- Apr 25, 2026, 06:52 AM UTC
- Duration
- 1d 5h
Affected: Americas (US-east), Americas (US-west), Canada (Montreal), EMEA (Ireland), EMEA (Frankfurt), APAC (Sydney), APAC (Singapore), APAC (Tokyo), APAC (Mumbai)
Timeline · 4 updates
-
investigating Apr 24, 2026, 02:38 PM UTC
We recently experienced a service disruption that caused agent interactions to fail. The issue was traced to an expired token, which prevented a backend service from writing query results to storage. We have applied a temporary mitigation by recycling the affected tenant, which has restored normal functionality. Our team is actively working on a permanent fix to prevent this issue from recurring. Impact: This issue affected agent interactions only. All other platform functionality remained unaffected.
-
identified Apr 24, 2026, 02:39 PM UTC
We have applied a temporary mitigation by recycling the affected tenant, which has restored normal functionality. Our team is actively working on a permanent fix to prevent this issue from recurring.
-
monitoring Apr 24, 2026, 05:47 PM UTC
The issue causing agent interaction failures has been resolved, and the agent system is now fully functional. We are actively monitoring system health to ensure continued stability.
-
resolved Apr 25, 2026, 06:52 AM UTC
This incident has been resolved.
- Detected by Pingoru
- Apr 23, 2026, 03:57 PM UTC
- Resolved
- Apr 23, 2026, 03:57 PM UTC
- Duration
- —
Affected: Americas (US-east) - Dev, Americas (US-east), Americas (US-west), Americas (US-west) - Dev, Canada (Montreal), Canada (Montreal) - Dev, EMEA (Ireland), EMEA (Ireland) - Dev, EMEA (Frankfurt) - Dev, EMEA (Frankfurt), APAC (Sydney), APAC (Sydney) - Dev, APAC (Singapore) - Dev, APAC (Singapore), APAC (Tokyo), APAC (Tokyo) - Dev, APAC (Mumbai), APAC (Mumbai) - Dev, PoV (Proof of Value)
Timeline · 1 update
-
resolved Apr 23, 2026, 03:57 PM UTC
We discovered an issue where Rich Text Editor fields across the Catalog were not displaying content correctly. The issue has been resolved by rolling back the problematic deployment.
- Detected by Pingoru
- Mar 11, 2026, 11:47 PM UTC
- Resolved
- Mar 12, 2026, 06:23 AM UTC
- Duration
- 6h 35m
Affected: Americas (US-east), Americas (US-west)
Timeline · 2 updates
-
monitoring Mar 11, 2026, 11:47 PM UTC
Some customers in the US-East-1 and US-West-2 regions encountered a service interruption to the Alation agent. The interruption has been remediated, and we are monitoring the status.
-
resolved Mar 12, 2026, 06:23 AM UTC
We have not seen the error reoccur in the last few hours; we are marking the incident as resolved.
- Detected by Pingoru
- Feb 19, 2026, 11:01 AM UTC
- Resolved
- Feb 19, 2026, 08:42 PM UTC
- Duration
- 9h 40m
Affected: Americas (US-east)
Timeline · 8 updates
- Detected by Pingoru
- Feb 02, 2026, 01:01 PM UTC
- Resolved
- Feb 02, 2026, 04:51 PM UTC
- Duration
- 3h 49m
Affected: Americas (US-east)
Timeline · 5 updates
-
investigating Feb 02, 2026, 02:01 PM UTC
Alation service has recovered for most tenants and is operating normally. However, a limited number of tenants are still experiencing service disruption (login failures, timeouts, or degraded performance). Our engineering team is prioritizing the restoration of service for the remaining affected tenants.
-
identified Feb 02, 2026, 02:18 PM UTC
Service has been restored for the majority of tenants. We have identified an issue affecting a small subset of tenants that are still experiencing errors and/or degraded performance. Targeted remediation is in progress to recover the remaining impacted tenants.
-
identified Feb 02, 2026, 03:01 PM UTC
Most tenants have recovered. A small subset of tenants remains impacted; we’re continuing targeted remediation.
-
monitoring Feb 02, 2026, 03:24 PM UTC
Service has been restored for the tenants that experienced failures. We are actively monitoring the infrastructure and application to validate expected behaviour.
-
resolved Feb 02, 2026, 04:51 PM UTC
Incident resolved. We’ll continue routine monitoring and will follow up if anything changes.
- Detected by Pingoru
- Oct 21, 2025, 05:26 PM UTC
- Resolved
- Oct 21, 2025, 09:04 PM UTC
- Duration
- 3h 38m
Affected: Americas (US-east) - Dev, Americas (US-east)
Timeline · 3 updates
-
investigating Oct 21, 2025, 05:26 PM UTC
We are currently investigating an issue with the MDE Pipeline service, which is preventing data extraction and causing errors. The error is related to a connection timeout to the pipeline service. Our team is working to resolve the issue as quickly as possible. We will keep you posted on progress as updates become available.
-
identified Oct 21, 2025, 06:35 PM UTC
The cause has been identified and a fix has been implemented. We are working on resolution.
-
resolved Oct 21, 2025, 09:04 PM UTC
The fix has been implemented and confirmed to resolve the issue. The root cause was the AWS US-East-1 outage on the previous day (Monday, October 20).
- Detected by Pingoru
- Oct 20, 2025, 09:45 AM UTC
- Resolved
- Oct 20, 2025, 02:25 PM UTC
- Duration
- 4h 39m
Affected: EMEA (Ireland)
Timeline · 3 updates
-
identified Oct 20, 2025, 02:18 PM UTC
We are investigating reports of degraded performance impacting customers in the EU region.
-
identified Oct 20, 2025, 02:20 PM UTC
A subset of EU customers may experience slower load times or timeouts when accessing the Alation application, as well as delays in query execution, search indexing, and access to catalog services.
-
resolved Oct 20, 2025, 02:25 PM UTC
The issue that was impacting customers in the EU region has been resolved. System performance has returned to normal, and the services are now operating normally.
Critical October 20, 2025
- Detected by Pingoru
- Oct 20, 2025, 07:40 AM UTC
- Resolved
- Oct 20, 2025, 10:00 AM UTC
- Duration
- 2h 19m
Affected: Americas (US-east) - Dev, Americas (US-east)
Timeline · 4 updates
-
identified Oct 20, 2025, 08:07 AM UTC
We have detected elevated error rates and degraded performance across parts of the Alation platform. This is caused by a service disruption at AWS, which is affecting one or more of their core services that Alation depends on. Our own systems are healthy, but upstream instability is affecting service delivery for our users. Impact: Some users may experience slower response times, timeouts, or failures when using certain features (for example: data catalog search, ingestion jobs, API calls or dashboard refreshes). Data integrity is not impacted; no data loss or corruption has been detected. Queued operations will retry automatically once upstream services recover.
-
identified Oct 20, 2025, 09:52 AM UTC
AWS states that it is still searching for the root cause and is actively working on the issue.
-
monitoring Oct 20, 2025, 09:54 AM UTC
AWS further reports “significant signs of recovery”: most requests should now be succeeding, though some services still have latency and backlog to clear. We see early signs of Alation service recovery; we will keep you updated.
-
resolved Oct 20, 2025, 10:00 AM UTC
The underlying AWS service has recovered, and all Alation services have returned to normal operation for affected customers. Our teams will continue to monitor the environment to ensure continued stability.
- Detected by Pingoru
- Jul 16, 2025, 03:59 PM UTC
- Resolved
- Jul 16, 2025, 08:59 PM UTC
- Duration
- 5h
Affected: Americas (US-east) - Dev, Americas (US-east), Americas (US-west), Americas (US-west) - Dev, Canada (Montreal), Canada (Montreal) - Dev, EMEA (Ireland), EMEA (Ireland) - Dev, EMEA (Frankfurt) - Dev, EMEA (Frankfurt), APAC (Sydney), APAC (Sydney) - Dev, APAC (Singapore) - Dev, APAC (Singapore), APAC (Tokyo), APAC (Tokyo) - Dev, APAC (Mumbai), APAC (Mumbai) - Dev, PoV (Proof of Value)
Timeline · 3 updates
-
investigating Jul 16, 2025, 03:59 PM UTC
Our engineering teams are working to identify the issue and are actively working to mitigate the impact. We will provide updates here every 30 minutes or as new information becomes available.
-
monitoring Jul 16, 2025, 06:25 PM UTC
We have identified and deployed a fix across all tenants and are monitoring performance.
-
resolved Jul 16, 2025, 08:59 PM UTC
This incident has been resolved.
- Detected by Pingoru
- Jul 15, 2025, 04:27 PM UTC
- Resolved
- Jul 16, 2025, 05:56 AM UTC
- Duration
- 13h 28m
Affected: Americas (US-east) - Dev, Americas (US-east), Americas (US-west), Americas (US-west) - Dev, Canada (Montreal), Canada (Montreal) - Dev, EMEA (Ireland), EMEA (Ireland) - Dev, EMEA (Frankfurt) - Dev, EMEA (Frankfurt), APAC (Sydney), APAC (Sydney) - Dev, APAC (Singapore) - Dev, APAC (Singapore), APAC (Tokyo), APAC (Tokyo) - Dev, APAC (Mumbai), APAC (Mumbai) - Dev, PoV (Proof of Value)
Timeline · 4 updates
-
investigating Jul 15, 2025, 04:27 PM UTC
We are currently investigating reports of latency across multiple regions. This is impacting availability and performance for some customers using our services. Our engineering teams are working to identify the issue and are actively working to mitigate the impact. We will provide updates here every 30 minutes or as new information becomes available.
-
monitoring Jul 15, 2025, 05:23 PM UTC
We have implemented a revised memory configuration, resulting in enhanced system stability and reduced latency. We will continue monitoring through the day and provide updates as appropriate.
-
monitoring Jul 15, 2025, 08:56 PM UTC
We are continuing to monitor environments for issues.
-
resolved Jul 16, 2025, 05:56 AM UTC
This incident has been resolved.
- Detected by Pingoru
- Jul 14, 2025, 04:53 PM UTC
- Resolved
- Jul 14, 2025, 06:32 PM UTC
- Duration
- 1h 38m
Affected: Americas (US-east) - Dev, Americas (US-east), Americas (US-west), Americas (US-west) - Dev, Canada (Montreal), Canada (Montreal) - Dev, EMEA (Ireland), EMEA (Ireland) - Dev, EMEA (Frankfurt) - Dev, EMEA (Frankfurt), APAC (Sydney), APAC (Sydney) - Dev, APAC (Singapore) - Dev, APAC (Singapore), APAC (Tokyo), APAC (Tokyo) - Dev, APAC (Mumbai), APAC (Mumbai) - Dev, PoV (Proof of Value)
Timeline · 4 updates
-
investigating Jul 14, 2025, 04:53 PM UTC
We are currently experiencing a service disruption across all regions. This is impacting availability and performance for some customers using our services. Our engineering teams are working to identify the issue and are actively working to mitigate the impact. We will provide updates here every 30 minutes or as new information becomes available.
-
identified Jul 14, 2025, 05:47 PM UTC
The issue has been identified and a fix is being implemented.
-
monitoring Jul 14, 2025, 06:31 PM UTC
The problem has been resolved, and all of our applications are now working properly. We are constantly monitoring the system.
-
resolved Jul 14, 2025, 06:32 PM UTC
This incident has been resolved.
- Detected by Pingoru
- Jun 04, 2025, 08:20 AM UTC
- Resolved
- Jun 04, 2025, 11:31 AM UTC
- Duration
- 3h 11m
Affected: Americas (US-east)
Timeline · 4 updates
-
investigating Jun 04, 2025, 08:20 AM UTC
We are currently experiencing a service disruption in the us-east-1 region. This is impacting availability and performance for some customers using our services hosted in this region. Our engineering teams are working to identify the issue and are actively working to mitigate the impact. We will provide updates here every 30 minutes or as new information becomes available.
-
identified Jun 04, 2025, 08:53 AM UTC
Our investigation revealed the issue originated from underlying infrastructure limitations. We have scaled up resources in the affected region and are monitoring for stability.
-
monitoring Jun 04, 2025, 08:56 AM UTC
All impacted customers are back online. Our team is actively monitoring the systems to ensure everything remains stable and performs as expected.
-
resolved Jun 04, 2025, 11:31 AM UTC
This incident has been resolved.
- Detected by Pingoru
- May 05, 2025, 05:30 AM UTC
- Resolved
- May 08, 2025, 08:36 AM UTC
- Duration
- 3d 3h
Affected: Americas (US-east) - Dev, Americas (US-west) - Dev
Timeline · 13 updates
- Detected by Pingoru
- Nov 27, 2024, 01:00 AM UTC
- Resolved
- Nov 27, 2024, 01:45 AM UTC
- Duration
- 45m
Affected: APAC (Sydney)
Timeline · 3 updates
-
investigating Nov 27, 2024, 01:00 AM UTC
ACS is unavailable for some customers in the Sydney region. The issue has been identified and engineers are working on remediation.
-
monitoring Nov 27, 2024, 01:23 AM UTC
Issue has been identified and a fix has been implemented. We are monitoring the results.
-
resolved Nov 27, 2024, 05:51 AM UTC
The incident that was affecting the ACS service has been resolved.
- Detected by Pingoru
- Oct 31, 2024, 10:26 PM UTC
- Resolved
- Nov 02, 2024, 07:18 AM UTC
- Duration
- 1d 8h
Affected: Americas (US-west)
Timeline · 2 updates
-
investigating Oct 31, 2024, 10:26 PM UTC
We are currently investigating an issue that impacts long-running QLI/MDE jobs in the ACS US-West region.
-
resolved Nov 02, 2024, 07:18 AM UTC
This incident has been resolved. Infrastructure maintenance has been completed.
- Detected by Pingoru
- Oct 29, 2024, 11:30 PM UTC
- Resolved
- Oct 31, 2024, 06:38 PM UTC
- Duration
- 1d 19h
Affected: Americas (US-east), Americas (US-west), Canada (Montreal), EMEA (Ireland), EMEA (Frankfurt), APAC (Sydney), APAC (Singapore), APAC (Tokyo)
Timeline · 4 updates
-
investigating Oct 30, 2024, 06:39 PM UTC
We are currently investigating an issue with the MDE Pipeline service, which is preventing data extraction and causing errors. The error is related to a connection timeout to the pipeline service. Our team is working to resolve the issue as quickly as possible. We will keep you posted on progress as updates become available.
-
investigating Oct 30, 2024, 08:17 PM UTC
The issue is impacting the US-east region only. All other regions are fully operational. The following error message may be seen in the impacted region: "HTTPConnectionPool(host='airflow-pipeline-service.default.svc.cluster.local', port=80): Read timed out. (read timeout=1800)"
-
monitoring Oct 31, 2024, 03:40 AM UTC
Our engineering team has resolved the issue causing the connection timeout to the Airflow pipeline service, and the system is now functioning as expected.
-
resolved Oct 31, 2024, 06:38 PM UTC
The incident has been resolved, and we have not seen the error reoccur during our monitoring period.
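For readers checking whether their own job failures match this incident, the read-timeout message quoted in the Oct 30, 2024 update can be matched in log lines. The following is a minimal sketch, not an Alation tool: the helper name `parse_read_timeout` and the returned field names are hypothetical, and the pattern assumes log lines carry the message in exactly the form quoted above.

```python
import re

# Hypothetical pattern: mirrors the error text quoted in the update above.
READ_TIMEOUT = re.compile(
    r"HTTPConnectionPool\(host='(?P<host>[^']+)', port=(?P<port>\d+)\): "
    r"Read timed out\. \(read timeout=(?P<timeout>\d+(?:\.\d+)?)\)"
)


def parse_read_timeout(line):
    """Return host, port, and timeout (seconds) if the line contains the
    read-timeout message, otherwise None."""
    match = READ_TIMEOUT.search(line)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "port": int(match.group("port")),
        "timeout_s": float(match.group("timeout")),
    }
```

The `read timeout=1800` value in the quoted message is in seconds, so an affected request would have waited 30 minutes before failing.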
Critical October 28, 2024
- Detected by Pingoru
- Oct 28, 2024, 11:56 AM UTC
- Resolved
- Oct 28, 2024, 01:15 PM UTC
- Duration
- 1h 19m
Affected: Americas (US-west), Americas (US-west) - Dev
Timeline · 4 updates
-
investigating Oct 28, 2024, 11:56 AM UTC
We are investigating a connectivity issue affecting the US-West-2 region. We are working to resolve it as quickly as possible and will provide updates shortly.
-
identified Oct 28, 2024, 12:08 PM UTC
The issue has been identified and a fix is being implemented.
-
monitoring Oct 28, 2024, 12:26 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Oct 28, 2024, 01:15 PM UTC
This incident has been resolved.
- Detected by Pingoru
- Sep 09, 2024, 04:54 AM UTC
- Resolved
- Sep 09, 2024, 06:12 AM UTC
- Duration
- 1h 17m
Affected: EMEA (Ireland), EMEA (Frankfurt), APAC (Sydney), APAC (Singapore), APAC (Tokyo)
Timeline · 3 updates
-
investigating Sep 09, 2024, 04:54 AM UTC
We are currently experiencing issues with our scheduled jobs, affecting some of our customers. Our team is actively investigating the cause, and we will keep you updated on the progress.
-
identified Sep 09, 2024, 05:42 AM UTC
The issue causing elevated errors with the scheduled jobs has been identified, and our internal team is actively working on a fix.
-
resolved Sep 09, 2024, 06:12 AM UTC
Our team has successfully identified and rectified the root cause of the issue affecting scheduled jobs, and is now actively monitoring the situation to prevent any future occurrences. Specifically, we have determined that the issue only impacted customers in the APAC (Sydney) region.
- Detected by Pingoru
- Jul 31, 2024, 12:06 PM UTC
- Resolved
- Jul 31, 2024, 04:08 PM UTC
- Duration
- 4h 1m
Affected: PoV (Proof of Value)
Timeline · 4 updates
-
identified Jul 31, 2024, 02:06 PM UTC
Users may encounter intermittent performance issues or temporary unavailability when accessing the Alation POV Instance. We are actively working to address the issue and will keep you updated on the progress.
-
identified Jul 31, 2024, 02:19 PM UTC
We are still actively working to restore service; further updates will be provided as they become available.
-
identified Jul 31, 2024, 03:48 PM UTC
Our team is still actively engaged in the incident, and we will keep you posted once we have further information.
-
resolved Jul 31, 2024, 04:08 PM UTC
The intermittent issue with the POV cluster has been resolved, and we are actively monitoring to ensure the service's stability.
- Detected by Pingoru
- Jun 30, 2024, 09:00 AM UTC
- Resolved
- Jun 30, 2024, 12:50 PM UTC
- Duration
- 3h 49m
Affected: Americas (US-east)
Timeline · 2 updates
-
identified Jun 30, 2024, 10:21 AM UTC
We're experiencing an upgrade issue affecting a small number of users in the US-east cluster. Our team is actively working to resolve the problem and restore full functionality. We'll continue to provide updates on our progress here.
-
resolved Jun 30, 2024, 12:52 PM UTC
The upgrade issue that affected a small set of customers has been resolved.