DataGrail incident
Service disruption during routine database upgrade
DataGrail experienced a critical incident on June 11, 2021, affecting Request Manager, Live Data Map, and 1 more component. It lasted 28 minutes. The incident has been resolved; the full update timeline is below.
Update timeline
- identified Jun 11, 2021, 03:30 AM UTC
A service disruption started around 8:11 PM PT during a routine database upgrade. We are actively resolving the issue and should be back up shortly. Updates to follow.
- monitoring Jun 11, 2021, 03:45 AM UTC
The service disruption has been resolved and all systems are back online. We are actively monitoring.
- resolved Jun 11, 2021, 03:58 AM UTC
The incident has been resolved. Please contact [email protected] if you have questions or would like a copy of our root cause analysis (once available).
- postmortem Jun 15, 2021, 06:10 PM UTC
On Thursday, June 10, 2021 (June 11 UTC), DataGrail experienced a disruption to privacy intake forms and application downtime lasting from 8:08 PM PT until 8:43 PM PT. The disruption was caused by a routine upgrade to apply security and performance improvements to DataGrail’s core database infrastructure. DataGrail Engineering had previously performed similar upgrades across its test environments and had not observed a service interruption. The DataGrail team has identified several measures to prevent this failure from recurring. These include a two-phase migration process for database upgrades, which will avoid downtime due to delayed maintenance, and enhanced failure-mode benchmarking in our test environments. Please contact [[email protected]](mailto:[email protected]) if you have any questions or concerns.