Alation Cloud Service incident
Metadata Extraction failure with read timeout to airflow cluster - Elevated Error
Alation Cloud Service experienced a minor incident on October 29, 2024, affecting Americas (US-east), Americas (US-west), and 1 more component, and lasting 1d 19h. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Oct 30, 2024, 06:39 PM UTC
We are currently investigating an issue with the MDE Pipeline service that is preventing data extraction and causing errors. The errors are caused by connections to the pipeline service timing out. Our team is working to resolve the issue as quickly as possible. We will post updates as they become available.
- investigating Oct 30, 2024, 08:17 PM UTC
The issue is impacting the US-east region only. All other regions are fully operational. The following error message may appear in the impacted region: "HTTPConnectionPool(host='airflow-pipeline-service.default.svc.cluster.local', port=80): Read timed out. (read timeout=1800)"
- monitoring Oct 31, 2024, 03:40 AM UTC
Our engineering team has resolved the issue that caused connections to the Airflow pipeline service to time out, and the system is now functioning as expected.
- resolved Oct 31, 2024, 06:38 PM UTC
The incident has been resolved, and we have not seen the error recur during our monitoring period.
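For readers checking their own extraction logs for the error quoted in the timeline, the message can be matched and broken into its parts. The following is a minimal sketch, not part of Alation's tooling: the function name `parse_read_timeout` and the regular expression are illustrative assumptions, and the message text is copied from the update above. The timeout value in this message format is expressed in seconds, so 1800 corresponds to 30 minutes.

```python
import re

# Error text as quoted in the incident timeline.
MESSAGE = (
    "HTTPConnectionPool(host='airflow-pipeline-service.default.svc.cluster.local', "
    "port=80): Read timed out. (read timeout=1800)"
)

# Hypothetical pattern for this message shape; adjust if your logs differ.
PATTERN = re.compile(
    r"HTTPConnectionPool\(host='(?P<host>[^']+)', port=(?P<port>\d+)\): "
    r"Read timed out\. \(read timeout=(?P<timeout>[\d.]+)\)"
)


def parse_read_timeout(message):
    """Return host, port, and timeout (seconds) from a read-timeout message, or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "port": int(match.group("port")),
        "timeout_seconds": float(match.group("timeout")),
    }


info = parse_read_timeout(MESSAGE)
if info is not None:
    minutes = info["timeout_seconds"] / 60
    print(f"{info['host']}:{info['port']} timed out after {minutes:.0f} minutes")
```

A line that does not match the pattern returns `None`, so the function can be applied to every log line and the non-matching ones skipped.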