Mercatus experienced a major incident on July 12, 2025 affecting Integration, lasting 17h 31m. The incident has been resolved; the full update timeline is below.
Affected components
- Integration
Update timeline
- investigating Jul 12, 2025, 09:03 AM UTC
We are currently experiencing issues with platform integrations, resulting in failed or delayed integration runs. Our integration team is investigating and working to identify the root cause and implement a fix. We will share updates as soon as more information becomes available. We appreciate your patience and understanding.
- identified Jul 12, 2025, 09:54 AM UTC
🔄 Update: Integration Failures
Status: Identified
Our team has identified a connection issue with the database that is impacting integration functionality. The DevOps team is actively investigating the root cause. As a precaution, we have temporarily paused the polling recipes to prevent further failures. The estimated time for resolution is 2 PM ET (July 12, 2025). We will continue to provide updates as we work toward restoring full functionality. Thank you for your continued patience.
- resolved Jul 13, 2025, 02:34 AM UTC
The connection issue with the database has been successfully resolved. All integration services are now functioning as expected, and the polling recipes have been re-enabled. We have verified system stability and are continuing to monitor performance to ensure everything remains operational. Thank you for your patience and understanding throughout this incident.
- postmortem Aug 05, 2025, 10:57 AM UTC
### 🔍 Summary

On July 12, 2025, a product patch release triggered widespread failures across platform integrations, specifically affecting recipes and SFTP processes. The issue stemmed from a database connection failure and unintended execution of SFTP recipes, leading to queue congestion and delayed data ingestion.

### 📌 Root Cause

* A **product patch** released on July 12 caused:
  * Failures in all recipes directly connected to the database.
  * Unintended execution of SFTP recipes, which processed all available files from Client’s MA SFTP servers simultaneously.
* This led to **queue congestion** and disrupted integration workflows.

### 📊 Impact Summary

**Affected Systems:**

* Integration recipes (DB connectors)
* SFTP integrations with client

**Impacted Clients:**

* Clients using integration services

**Business Impact:**

* Delayed and out-of-order data ingestion.
* Manual intervention required to reset queues.
* Risk of inaccurate downstream reporting and analytics.

### ⚠️ Contributing Factors

* **No file archiving** during SFTP server migration.
* **OS upgrade** from Amazon Linux 2 to Amazon Linux 2023 introduced a new network interface name (`ens5` instead of `eth0`).
* Existing NAT masquerading scripts were hardcoded to `eth0`, causing failures until updated.

### 🛠️ Mitigation & Resolution

* **Immediate Actions:**
  * Paused polling recipes to prevent further failures.
  * DevOps team initiated investigation and identified the database connection issue.
* **Resolution:**
  * Database connection issue resolved.
  * Polling recipes re-enabled.
  * System stability verified and performance monitoring initiated.

### ✅ Follow-Up Actions

* Review and update all scripts to avoid hardcoded network interface dependencies.
* Implement file archiving protocols during SFTP migrations.
* Enhance patch testing procedures to include integration impact assessments.
* Improve alerting and monitoring for recipe execution anomalies.
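
To illustrate the follow-up action on hardcoded network interface names, here is a minimal Python sketch of one possible approach. It looks up the interface that carries the default route and builds the masquerade rule from that name, so nothing depends on `eth0` or `ens5`. This is not the script used in the incident, which the report does not show. The function names `default_interface` and `masquerade_command` are hypothetical, and the sketch assumes a Linux host that exposes its routing table at `/proc/net/route`.

```python
from typing import List, Optional

# Route flag bits as they appear in the Flags column of /proc/net/route.
RTF_UP = 0x0001       # route is usable
RTF_GATEWAY = 0x0002  # route goes through a gateway


def default_interface(route_table: str) -> Optional[str]:
    """Return the interface of the lowest-metric default route, or None.

    `route_table` is the text content of /proc/net/route (hypothetical
    helper; the caller is responsible for reading the file).
    """
    best = None  # (metric, interface name)
    for line in route_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 8:
            continue  # ignore blank or malformed rows
        iface, dest, mask = fields[0], fields[1], fields[7]
        flags = int(fields[3], 16)  # Flags column is hexadecimal
        metric = int(fields[6])     # Metric column is decimal
        is_default = dest == "00000000" and mask == "00000000"
        if is_default and flags & RTF_UP and flags & RTF_GATEWAY:
            if best is None or metric < best[0]:
                best = (metric, iface)
    return best[1] if best else None


def masquerade_command(iface: str) -> List[str]:
    """Build (but do not run) an iptables NAT masquerade rule for `iface`."""
    return ["iptables", "-t", "nat", "-A", "POSTROUTING",
            "-o", iface, "-j", "MASQUERADE"]
```

A script could read `/proc/net/route`, pass the text to `default_interface`, and stop with a clear error when it returns `None`, so that a rule is never applied to an interface that does not exist. Keeping the lookup separate from the command construction also lets both be unit-tested without root privileges.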