TestRail incident

Some instances are currently showing maintenance mode.

Major · Resolved

TestRail experienced a major incident on September 28, 2021 affecting TestRail Cloud, lasting 41d 4h. The incident has been resolved; the full update timeline is below.

Started
Sep 28, 2021, 12:56 PM UTC
Resolved
Nov 08, 2021, 05:06 PM UTC
Duration
41d 4h
Detected by Pingoru
Sep 28, 2021, 12:56 PM UTC
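
The 41d 4h duration can be re-derived from the Started and Resolved timestamps above. The following is a minimal sketch in Python (3.9 or later), not part of the original status page; the `parse_utc` helper and the format string are assumptions introduced for illustration, and the month abbreviations assume an English locale.

```python
from datetime import datetime, timezone

# Format of the timestamps as printed on this page, without the " UTC" label.
FMT = "%b %d, %Y, %I:%M %p"

def parse_utc(stamp: str) -> datetime:
    """Hypothetical helper: parse a page timestamp and mark it as UTC."""
    return datetime.strptime(stamp.removesuffix(" UTC"), FMT).replace(tzinfo=timezone.utc)

started = parse_utc("Sep 28, 2021, 12:56 PM UTC")
resolved = parse_utc("Nov 08, 2021, 05:06 PM UTC")

elapsed = resolved - started
hours, remainder = divmod(elapsed.seconds, 3600)
minutes = remainder // 60
print(f"{elapsed.days}d {hours}h {minutes}m")  # 41d 4h 10m
```

The page rounds this down to whole hours, which gives the 41d 4h shown.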

Affected components

TestRail Cloud

Update timeline

  1. investigating Sep 28, 2021, 12:56 PM UTC

    We are currently investigating this issue.

  2. identified Sep 28, 2021, 01:32 PM UTC

    The issue has been identified and a fix is being implemented.

  3. identified Sep 28, 2021, 05:03 PM UTC

    We are continuing to implement the fix identified earlier today, and service has been restored to a subset of affected instances. We will provide an update as soon as we have more information.

  4. identified Sep 29, 2021, 07:34 AM UTC

    We’re continuing to work with AWS Support and we will keep you updated as the situation progresses.

  5. monitoring Sep 29, 2021, 10:15 AM UTC

    All affected instances are now functioning. Data has been restored to the affected instances from a backup taken prior to the crash, timestamped September 28, 2021, 1:11:11 AM UTC. We are working to restore any changes made after this time to the affected instances, and will continue to provide updates as we make progress.

  6. monitoring Sep 29, 2021, 02:16 PM UTC

    Per our earlier update, all affected instances should now be accessible. However, affected instances may still be missing data, because they were restored from a backup taken prior to the crash. The backup used has the following timestamp: September 28, 2021, 1:11:11 AM UTC. What this means: data added to your instance after the timestamp but before the outage may not have been fully restored to your instance. We are continuing to work with AWS to recover and restore data added to your instance after the backup timestamp, but this process is expected to take some time. We will update this status once we have completed our work with AWS to restore all of your data. We appreciate your patience and apologise for any inconvenience caused in the meantime.

  7. monitoring Sep 30, 2021, 03:06 PM UTC

    All Cloud systems are operational, but we are continuing to work with Amazon to recover and restore data added to affected instances after the backup timestamp, September 28, 2021, 1:11:11 AM UTC. We will post another update at or before 22:30 UTC today, September 30, 2021, unless we receive more information before that.

  8. monitoring Sep 30, 2021, 11:04 PM UTC

    There are no new updates at this time. We are continuing to work with Amazon to recover and restore data added to affected instances after the backup timestamp, September 28, 2021, 1:11:11 AM UTC. We will post another update at or before October 1, 2021, at 15:00 UTC, unless we receive more information before that.

  9. monitoring Oct 01, 2021, 06:58 PM UTC

    There are no new updates at this time. We are continuing to work with Amazon to recover and restore data added to affected instances after the backup timestamp, September 28, 2021, 1:11:11 AM UTC. Please note that you are still able to use your instance at this time, as we will be working to merge the current data after the restore has completed. We will post another update on October 4, 2021, at 15:00 UTC, unless we receive more information before that.

  10. monitoring Oct 04, 2021, 11:09 PM UTC

    We have successfully recovered data that was added to affected instances between September 28, 2021, 1:11:11 AM UTC and September 28, 2021, 12:12 PM UTC. We will reach out to provide more details about how we will be restoring recovered data to affected TestRail instances once we have more information. Please note that you are still able to use your instance at this time, as we will be working to merge your current data with the restored data.

  11. resolved Nov 08, 2021, 05:06 PM UTC

    We are working directly with affected customers and/or customer instances and will be restoring data to them over the next ~2 weeks. We will contact affected customers directly to confirm when this has been done. Since instances are functioning correctly at this time, we are treating this incident as resolved and no further updates will be made here.
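
The timestamps quoted in the updates above bound the window of data that had to be recovered separately from the backup. The following Python sketch, which is not part of the vendor's updates, computes that window from the stated backup time and the stated end of the recovered range. It assumes "12:12 PM UTC" means 12:12:00, and the variable names are illustrative. The source does not say whether any data was written between 12:12 PM and the first status update at 12:56 PM, so the second figure is only the length of that interval, not a statement about lost data.

```python
from datetime import datetime, timezone

# Backup timestamp quoted in updates 5 through 10.
backup = datetime(2021, 9, 28, 1, 11, 11, tzinfo=timezone.utc)
# End of the recovered range from update 10 (seconds assumed to be :00).
recovered_through = datetime(2021, 9, 28, 12, 12, 0, tzinfo=timezone.utc)
# Time of the first "investigating" update.
first_update = datetime(2021, 9, 28, 12, 56, 0, tzinfo=timezone.utc)

recovered_window = recovered_through - backup
gap_before_first_update = first_update - recovered_through

print(recovered_window)         # 11:00:49
print(gap_before_first_update)  # 0:44:00
```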