BlueFox Host incident

Widespread outage

Critical Resolved

BlueFox Host experienced a critical incident on June 9, 2021 affecting Dallas, US/1 (BHDA1), lasting about 1 day (24 hours 4 minutes). The incident has been resolved; the full update timeline is below.

Started
Jun 09, 2021, 08:54 PM UTC
Resolved
Jun 10, 2021, 08:58 PM UTC
Duration
1d
Detected by Pingoru
Jun 09, 2021, 08:54 PM UTC
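
The 1d duration can be checked against the Started and Resolved timestamps above. The following is a minimal Python sketch of that arithmetic; the helper name `parse_utc` and the format string are illustrative assumptions, not part of the status page's tooling.

```python
from datetime import datetime, timezone

# Assumed format of the timestamps as displayed on this page, e.g. "Jun 09, 2021, 08:54 PM UTC".
STAMP_FORMAT = "%b %d, %Y, %I:%M %p"


def parse_utc(stamp: str) -> datetime:
    """Parse a displayed timestamp (hypothetical helper) into a timezone-aware UTC datetime."""
    # Drop the trailing " UTC" label, then attach the UTC timezone explicitly.
    naive = datetime.strptime(stamp.removesuffix(" UTC"), STAMP_FORMAT)
    return naive.replace(tzinfo=timezone.utc)


started = parse_utc("Jun 09, 2021, 08:54 PM UTC")
resolved = parse_utc("Jun 10, 2021, 08:58 PM UTC")

duration = resolved - started
print(duration)  # 1 day, 0:04:00
```

The result is four minutes over one day, which the page rounds to 1d.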

Affected components

Dallas, US/1 (BHDA1)

Update timeline

  1. identified Jun 09, 2021, 08:54 PM UTC

    A response team has been assigned to the incident and is working on restoring full access to the services.

  2. identified Jun 09, 2021, 10:15 PM UTC

    We are continuing to work on a fix for this issue.

  3. identified Jun 09, 2021, 10:44 PM UTC

    We are continuing to work on a fix for this issue.

  4. identified Jun 09, 2021, 10:48 PM UTC

    This issue is partially resolved; we're awaiting manual intervention from the datacenter.

    - bh007
      - bh007 is unavailable
      - A hard reset will resolve the issue
      - There is no way for the control plane to communicate with this server, nor remote management software
    - bh009
      - bh009 is having some monitoring issues but is operating as expected
    - bh004 (which is running the reverse proxy) is operating as expected
      - The server is currently rate limited by ZeroSSL and LetsEncrypt
      - This is causing SSL errors/issues to occur
      - The process is still ongoing
      - About 1/4 of sites have been restored
      - The SSL issues should be resolved within 48 hours
      - No manual intervention is possible due to the scale

    We will inform you of any further changes to the incident.

  5. identified Jun 10, 2021, 04:16 PM UTC

    We are monitoring the situation with the reverse proxy on bh004 to ensure it's operating as expected.

  6. monitoring Jun 10, 2021, 08:22 PM UTC

    A fix has been implemented and we are monitoring the results.

  7. monitoring Jun 10, 2021, 08:56 PM UTC

    - bh007
      - bh007 is operating as expected
    - bh009
      - bh009 is operating as expected
    - bh004 (which is running the reverse proxy) is operating as expected
      - The server is currently rate limited by ZeroSSL and LetsEncrypt
      - This is causing SSL errors/issues to occur
      - The process is still ongoing
      - About 1/4 of sites have been restored
      - The SSL issues should be resolved within 48 hours
      - No manual intervention is possible due to the scale

  8. resolved Jun 10, 2021, 08:58 PM UTC

    - bh007 is operating as expected
    - bh009 is operating as expected
    - bh004 is operating as expected
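
The timeline notes that certificate reissuance on bh004 was rate limited and expected to take up to 48 hours. The sketch below is a rough illustration of how such an estimate could be made under a fixed-window rate limit. The function name, the site count, and the limit values are all hypothetical; they are not taken from BlueFox Host, ZeroSSL, or Let's Encrypt.

```python
import math
from datetime import timedelta


def estimate_reissue_time(pending_sites: int,
                          certs_per_window: int,
                          window: timedelta) -> timedelta:
    """Upper-bound estimate of the time needed to request one certificate per pending site.

    Assumes a fixed-window limit and that the current window is already
    exhausted, so even the first batch has to wait for a full window.
    """
    if certs_per_window <= 0:
        raise ValueError("certs_per_window must be positive")
    if pending_sites <= 0:
        return timedelta(0)
    # Each window allows at most certs_per_window requests; round up to whole windows.
    windows_needed = math.ceil(pending_sites / certs_per_window)
    return windows_needed * window


# Hypothetical numbers for illustration only.
estimate = estimate_reissue_time(pending_sites=900,
                                 certs_per_window=300,
                                 window=timedelta(hours=3))
print(estimate)  # 9:00:00
```

With limits like these, the waiting time is set by the issuer's window rather than by operator effort, which is consistent with the note that manual intervention was not possible at this scale.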