Retreaver incident

Reporting is down

Major, Resolved

Retreaver experienced a major incident on January 11, 2018, lasting 25 minutes. The incident has been resolved; the full update timeline is below.

Started
Jan 11, 2018, 06:07 PM UTC
Resolved
Jan 11, 2018, 06:32 PM UTC
Duration
25m
Detected by Pingoru
Jan 11, 2018, 06:07 PM UTC

Update timeline

  1. investigating Jan 11, 2018, 06:07 PM UTC

    Calls continue to be processed as normal. We are working on a fix.

  2. monitoring Jan 11, 2018, 06:18 PM UTC

    Reporting is back up. We are restoring the calls that are missing from the search index; no data has been lost. Reindexing will be complete shortly. We apologize for the inconvenience.

  3. monitoring Jan 11, 2018, 06:31 PM UTC

    Reporting is back up. At approximately 6:45 am HST (4:45 PM UTC), our Solr server experienced an error writing to Amazon Elastic Block Store (EBS). Unfortunately, this error did not trigger our monitoring systems. Retreaver is architected so that a failure in our search servers does not affect call processing or any other functionality. Once we became aware of the failure, we rebooted the server to restore access to EBS and updated the search index. We regret that our monitoring did not flag the error, and we apologize for the interruption to our customers. We are working to improve our monitoring so that this type of incident can be handled more quickly in the future.

  4. resolved Jan 11, 2018, 06:32 PM UTC

    This incident has been resolved.
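
The third update describes a storage write error that went unnoticed by monitoring. One common way to surface that class of failure is an active write probe: periodically write a small file to the data volume and force it to disk. The sketch below is illustrative only and does not represent Retreaver's monitoring; the function name `volume_is_writable` and the probe file prefix are assumptions made for this example.

```python
import os
import tempfile


def volume_is_writable(directory: str) -> bool:
    """Return True if a small file can be written and flushed in `directory`.

    Hypothetical helper for illustration; not part of any vendor tooling.
    """
    try:
        # Create the probe file on the volume under test, not in the system temp dir.
        fd, path = tempfile.mkstemp(prefix=".write_probe_", dir=directory)
    except OSError:
        # Directory missing, read-only, or the device is rejecting writes.
        return False

    try:
        os.write(fd, b"probe")
        os.fsync(fd)  # ask the OS to push the data through to the storage device
        return True
    except OSError:
        return False
    finally:
        # Best-effort cleanup; errors here should not mask the probe result.
        try:
            os.close(fd)
        except OSError:
            pass
        try:
            os.unlink(path)
        except OSError:
            pass
```

A scheduler or monitoring agent could call `volume_is_writable` on the search index's data directory and raise an alert when it returns False. Probing the write path directly catches failures that a simple "is the process running" check would miss.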