Gurobi Optimization experienced a minor incident on March 9, 2021, affecting the Gurobi Website and lasting 2 days. The incident has been resolved; the full update timeline is below.
Update timeline
- identified Mar 09, 2021, 01:07 AM UTC
Currently, Gurobi registrations for new website accounts are severely delayed. This was caused by a server upgrade, which broke synchronization between the registration system and the website accounts. A complete resync is underway, which will take several hours to complete. If you need urgent assistance, please send email to [email protected] and we will try to help. Existing website accounts are not affected.
- identified Mar 09, 2021, 01:07 AM UTC
We are continuing to work on a fix for this issue.
- identified Mar 09, 2021, 06:37 AM UTC
We are continuing to work on a fix for this issue.
- identified Mar 09, 2021, 09:36 AM UTC
The repair is underway and should be completed in approximately 21 hours. Also, some customers are receiving emails with a link to reset their website password. These emails are related to this issue; you can ignore them.
- identified Mar 09, 2021, 05:49 PM UTC
The sync repair is still underway; current ETA is 12:00 am PT / 3:00 am ET / 08:00 GMT / 09:00 CET.
- identified Mar 10, 2021, 08:00 PM UTC
The first of two synchronizations completed about 12 hours ago. The second is underway and should be complete within 20 hours; we are investigating ways to speed this up.
- identified Mar 11, 2021, 01:52 AM UTC
Data synchronization is complete and all systems are back to normal. We hope to write a post-mortem later this week.
- resolved Mar 11, 2021, 01:53 AM UTC
This incident has been resolved.
- postmortem Mar 11, 2021, 08:01 PM UTC
On Monday, we started a cluster of new CRM servers to handle our growing business. However, the new servers created a race condition that corrupted data when multiple servers simultaneously tried to synchronize updates with our marketing automation system. This prevented new users from accessing the Gurobi website. We shut down the new CRM servers and started a full sync to repair the link with the marketing automation system. The sync logs identified the corrupted records; once we repaired those records, the sync completed and all systems were restored. We have updated the procedure for starting the new CRM servers to avoid this race condition, and we have documented the process for repairing the sync should it become corrupted again. Furthermore, we will reschedule the CRM server upgrade for off-peak hours. We appreciate your patience during this incident.
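As a generic illustration of the kind of race condition described above, the following minimal Python sketch shows several workers applying read-modify-write updates to a shared store, with a lock serializing each update. It is not the actual CRM or marketing automation code, whose details are not given in this report; `MarketingStore`, `sync`, `run_servers`, and the record ID `account-1` are hypothetical names chosen for the example.

```python
import threading


class MarketingStore:
    """Hypothetical stand-in for a shared system that several servers update."""

    def __init__(self):
        self.records = {}
        self._lock = threading.Lock()

    def sync(self, record_id, delta):
        # The read and the write happen under one lock, so two servers
        # cannot both read the old value and overwrite each other's update.
        with self._lock:
            current = self.records.get(record_id, 0)
            self.records[record_id] = current + delta


def run_servers(store, n_servers, updates_per_server):
    """Simulate n_servers threads that each push updates to the same record."""

    def worker():
        for _ in range(updates_per_server):
            store.sync("account-1", 1)

    threads = [threading.Thread(target=worker) for _ in range(n_servers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.records["account-1"]


if __name__ == "__main__":
    total = run_servers(MarketingStore(), n_servers=4, updates_per_server=1000)
    print(total)
```

Without the lock, two workers could read the same value and each write back its own result, silently losing one update. Serializing the read-modify-write step is one common way to rule that out; the report does not say which approach was used in the updated startup procedure.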