House of Control incident

Service Disruption in Production Environment

Critical · Resolved

House of Control experienced a critical incident on September 4, 2024, lasting 7h 6m and affecting four components: Complete Control, Region - Norway; Complete Control, Region - Denmark; Complete Control, Region - Cloud; and House of Control Website. The incident has been resolved; the full update timeline is below.

Started: Sep 04, 2024, 07:27 AM UTC
Resolved: Sep 04, 2024, 02:33 PM UTC
Duration: 7h 6m
Detected by Pingoru: Sep 04, 2024, 07:27 AM UTC

Affected components

Complete Control, Region - Norway
House of Control Website
Complete Control, Region - Denmark
Complete Control, Region - Cloud

Update timeline

  1. investigating Sep 04, 2024, 07:27 AM UTC

    We are currently experiencing a service disruption in our production environment. Our technical team is working to identify and resolve it as quickly as possible. During this period, some services may be unavailable or experience degraded performance. We will provide updates on the progress and notify you as soon as the issue has been resolved. We apologize for any inconvenience this may cause and appreciate your patience and understanding. If you have any questions or need further information, please don’t hesitate to reach out.

  2. investigating Sep 04, 2024, 07:58 AM UTC

    UPDATE: Ongoing Service Disruption in Production Environment! The issue has been identified. Our technical teams are continuing to work on resolving the service disruption in our production environment. We will provide the next update in 30 minutes. Thank you for your continued patience.

  3. monitoring Sep 04, 2024, 08:15 AM UTC

    A fix has been implemented and we are monitoring the results.

  4. investigating Sep 04, 2024, 08:23 AM UTC

    Update! The issue has been identified and resolved. The solution should now be functioning as expected. We are still monitoring the system to ensure everything is functioning as it should. We apologize for any inconvenience this may have caused.

  5. resolved Sep 04, 2024, 02:33 PM UTC

    The issue has been resolved. An improvement has been implemented and successfully tested. All services are now operating normally. Thank you for your patience. If you encounter any further issues, please reach out to our support team.

  6. postmortem Sep 04, 2024, 02:34 PM UTC

    ### **Incident Name**: Service Disruption in Production Environment

    ### **Date and Time**: September 4th, 08:55

    ### **Duration**: 1 hour and 15 minutes

    ### **Affected Services**: Login to Complete Control

    ### **Severity Level**: Priority 1

    ### **What Happened**:

    During an upgrade by our hosting provider, a configuration error was introduced in the firewall/gateway software within the core infrastructure. As a result, customer internet traffic was unable to reach its destination.

    ### **Detection**:

    The issue was first reported by House of Control’s Customer Success team.

    ### **Resolution Timeline**:

    * **08:55** – Servers lost connection to the internet due to a misconfiguration in the network's high-availability (HA) setup.
    * **09:10** – Solution team was notified.
    * **09:30** – Solution team contacted the hosting provider for status verification.
    * **09:55** – Hosting provider began working on the infrastructure fix.
    * **10:05** – Solution team received instructions from the hosting provider to resolve the issue.
    * **10:10** – Production servers were restored to normal operations.

    ### **Root Cause**:

    A misconfiguration in the firewall/gateway setup during an upgrade by the hosting provider caused internet traffic disruptions for customers.

    ### **Resolution**:

    The hosting provider’s CTO and network engineers resolved the configuration issue. They ensured that both internal and external IP addresses are correctly handled and that failover between gateways works as intended to prevent similar disruptions.
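
The stated duration of 1 hour and 15 minutes can be checked against the Resolution Timeline in the postmortem with a few lines of Python. This is a minimal sketch, not part of the vendor's report: the names `TIMELINE`, `step_gaps`, and `total_duration` are hypothetical, the event labels are shortened paraphrases, and because the postmortem does not state a time zone, the timestamps are treated as naive wall-clock times on a single day.

```python
from datetime import datetime, timedelta

# Timestamps copied from the postmortem's Resolution Timeline.
# Assumption: all entries fall on the same day and share one (unstated) time zone.
TIMELINE = [
    ("08:55", "Servers lost connection to the internet"),
    ("09:10", "Solution team was notified"),
    ("09:30", "Solution team contacted the hosting provider"),
    ("09:55", "Hosting provider began working on the fix"),
    ("10:05", "Solution team received instructions"),
    ("10:10", "Production servers were restored"),
]


def parse(hhmm: str) -> datetime:
    """Parse an HH:MM string into a datetime on an arbitrary fixed date."""
    return datetime.strptime(hhmm, "%H:%M")


def step_gaps(timeline):
    """Return (label, minutes since the previous entry) for every entry after the first."""
    times = [parse(stamp) for stamp, _ in timeline]
    return [
        (label, int((cur - prev).total_seconds() // 60))
        for (_, label), prev, cur in zip(timeline[1:], times, times[1:])
    ]


def total_duration(timeline) -> timedelta:
    """Elapsed time between the first and last timeline entries."""
    return parse(timeline[-1][0]) - parse(timeline[0][0])


if __name__ == "__main__":
    for label, minutes in step_gaps(TIMELINE):
        print(f"+{minutes:>2} min  {label}")
    print("Total:", total_duration(TIMELINE))
```

Under these assumptions the total comes to 1 hour and 15 minutes, matching the postmortem, and the longest single gap is the 25 minutes between contacting the hosting provider (09:30) and the provider starting work on the fix (09:55).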