TestRail incident

Some instances are currently showing maintenance mode.

TestRail experienced a major incident on September 28, 2021 affecting TestRail Cloud, lasting 41d 4h. The incident has been resolved; the full update timeline is below.

Started: Sep 28, 2021, 12:56 PM UTC
Resolved: Nov 08, 2021, 05:06 PM UTC
Duration: 41d 4h
Detected by Pingoru: Sep 28, 2021, 12:56 PM UTC

Affected components

TestRail Cloud

Update timeline

investigating Sep 28, 2021, 12:56 PM UTC

We are currently investigating this issue.
identified Sep 28, 2021, 01:32 PM UTC

The issue has been identified and a fix is being implemented.
identified Sep 28, 2021, 05:03 PM UTC

We are continuing to implement the fix identified earlier today, and service has been restored to a subset of affected instances. We will provide an update as soon as we have more information.
identified Sep 29, 2021, 07:34 AM UTC

We’re continuing to work with AWS Support and we will keep you updated as the situation progresses.
monitoring Sep 29, 2021, 10:15 AM UTC

All affected instances are now functioning. Data has been restored to the affected instances from a backup taken before the crash, timestamped September 28, 2021, 1:11:11 AM UTC. We are working to restore any changes made after this time to the affected instances, and will continue to provide updates as we make progress.
monitoring Sep 29, 2021, 02:16 PM UTC

Per our earlier update, all affected instances should now be accessible. However, affected instances may still be missing data, because they were restored from a backup taken before the crash. The backup has the following timestamp: September 28, 2021, 1:11:11 AM UTC. What this means: data added to your instance after that timestamp but before the outage may not have been fully restored. We are continuing to work with AWS to recover and restore data added after the backup timestamp, but this process is expected to take some time. We will update this status once we have completed our work with AWS to restore all of your data. We appreciate your patience and apologise for any inconvenience caused in the meantime.
monitoring Sep 30, 2021, 03:06 PM UTC

All Cloud systems are operational, but we are continuing to work with Amazon to recover and restore data added to affected instances after the backup timestamp, September 28, 2021, 1:11:11 AM UTC. We will post another update at or before 22:30 UTC today, September 30, 2021, unless we receive more information before that.
monitoring Sep 30, 2021, 11:04 PM UTC

There are no new updates at this time. We are continuing to work with Amazon to recover and restore data added to affected instances after the backup timestamp, September 28, 2021, 1:11:11 AM UTC. We will post another update at or before October 1, 2021, at 15:00 UTC, unless we receive more information before that.
monitoring Oct 01, 2021, 06:58 PM UTC

There are no new updates at this time. We are continuing to work with Amazon to recover and restore data added to affected instances after the backup timestamp, September 28, 2021, 1:11:11 AM UTC. Please note that you are still able to use your instance at this time, as we will be working to merge the current data after the restore has completed. We will post another update on October 4, 2021, at 15:00 UTC, unless we receive more information before that.
monitoring Oct 04, 2021, 11:09 PM UTC

We have successfully recovered data that was added to affected instances between September 28, 2021, 1:11:11 AM UTC and September 28, 2021, 12:12 PM UTC. Once we have more information, we will reach out with details about how we will restore the recovered data to affected TestRail instances. Please note that you are still able to use your instance at this time, as we will be working to merge your current data with the restored data.
resolved Nov 08, 2021, 05:06 PM UTC

We are working directly with affected customers and will be restoring data to their instances over approximately the next two weeks. We will contact affected customers directly to confirm when this has been done. Since instances are functioning correctly at this time, we are treating this incident as resolved and no further updates will be made here.