Sherpany incident

Connectivity Issues

Major Resolved

Sherpany experienced a major incident on November 22, 2019 affecting Sherpany Application and Sherpany API, lasting 1d 4h. The incident has been resolved; the full update timeline is below.

Started
Nov 22, 2019, 09:25 AM UTC
Resolved
Nov 23, 2019, 01:38 PM UTC
Duration
1d 4h
Detected by Pingoru
Nov 22, 2019, 09:25 AM UTC

Affected components

Sherpany Application, Sherpany API

Update timeline

  1. investigating Nov 22, 2019, 09:25 AM UTC

    Our hosting provider is currently facing network issues. This is affecting the availability and performance of our services. We are investigating and will keep you updated. We are sorry for the inconvenience this may cause.

  2. identified Nov 22, 2019, 10:18 AM UTC

    The issue has been identified. We are preparing to switch over to our datacenter failover site to bring our services back online. We will update you again in one hour.

  3. monitoring Nov 22, 2019, 01:26 PM UTC

    Our hosting provider has confirmed that the network is working normally again. However, performance or availability issues may still recur. We are continuing to investigate the issue and will take further measures as needed.

  4. monitoring Nov 22, 2019, 05:07 PM UTC

    All systems are currently operational and have been stable throughout the afternoon. We are still keeping a close eye on our infrastructure. A number of staff members will continue to work on this issue over the weekend, and we have made sure that several employees of Sherpany, our cloud provider, and the vendor of the network equipment are on standby should the situation require more immediate action. We are extremely sorry for the trouble this issue has caused. A detailed post-mortem report will follow.

  5. resolved Nov 23, 2019, 01:38 PM UTC

    The network in our primary datacenter has now been stable for more than 24 hours, and we are convinced that it will remain that way. The network equipment provider has installed the most robust solution. All our services have been fully available ever since. We will follow up with a detailed incident report by the middle of next week at the latest. Please accept our sincere apologies for any inconvenience this may have caused.