Quantum Workplace incident
All applications offline due to database outage.
Quantum Workplace experienced a critical incident on July 6, 2022. The incident has been resolved; the full update timeline is below.
Update timeline
- resolved Jul 06, 2022, 08:12 PM UTC
12:54:07 (CDT) - The Azure host that runs our primary database server (PRD-EUS2-SQL01A) had an outage. The physical node where the VM was running encountered a hardware issue caused by a faulty disk, which impacted the VMs on the node.
12:55:00 (CDT) - The server was migrated to a new host.
13:05:45 (CDT) - The server started back up on the new host.
13:08:30 (CDT) - Traffic was back online.
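For reference, the timestamps in the timeline above put the total outage at roughly 14 minutes 23 seconds, most of it (10 minutes 45 seconds) spent between the migration and the server restarting on the new host. The short sketch below just redoes that arithmetic; the event labels and the `elapsed_since_start` helper are illustrative names, not part of any tooling we run.

```python
from datetime import datetime

# Timestamps (CDT) copied from the timeline; the labels are paraphrased.
events = [
    ("host outage begins", "12:54:07"),
    ("server migrated to a new host", "12:55:00"),
    ("server started on the new host", "13:05:45"),
    ("traffic back online", "13:08:30"),
]


def parse(clock_time):
    """Parse an HH:MM:SS string; all events fall on the same day."""
    return datetime.strptime(clock_time, "%H:%M:%S")


def elapsed_since_start(events):
    """Return (label, timedelta since the first event) for each event."""
    start = parse(events[0][1])
    return [(label, parse(clock_time) - start) for label, clock_time in events]


for label, delta in elapsed_since_start(events):
    print(f"+{delta}  {label}")
# +0:00:00  host outage begins
# +0:00:53  server migrated to a new host
# +0:11:38  server started on the new host
# +0:14:23  traffic back online
```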
- postmortem Jul 06, 2022, 08:12 PM UTC
Takeaways:
* Azure did successfully self-heal the outage as designed.
* We should have a nicer-looking site-down page instead of a "502 gateway error".
* We will re-evaluate enabling auto-failover to our secondary database server (the trade-off is that you can fail over more quickly, but there is a chance of data loss).