Ports Group incident

Network connectivity issues

Critical Resolved View vendor source →

Ports Group experienced a critical incident on May 13, 2025, lasting 36m. The incident has been resolved; the full update timeline is below.

Started
May 13, 2025, 09:55 AM UTC
Resolved
May 13, 2025, 10:31 AM UTC
Duration
36m
Detected by Pingoru
May 13, 2025, 09:55 AM UTC

Update timeline

  1. investigating May 13, 2025, 09:55 AM UTC

    We are currently experiencing connectivity issues. This outage is affecting all on-prem services.

  2. investigating May 13, 2025, 10:09 AM UTC

    Network connectivity has returned to normal. We're investigating root cause.

  3. monitoring May 13, 2025, 10:09 AM UTC

    Upstream provider confirmed no issues on their end. Investigation continues.

  4. resolved May 13, 2025, 10:31 AM UTC

    The incident is resolved

  5. postmortem May 13, 2025, 12:31 PM UTC

    ### 1. Incident Summary Between **11:41:00 UTC\+2** and **12:03:00 UTC\+2** on **2025-05-13**, all traffic that traverses our on-prem firewall experienced severe to complete packet drops. The disruption stemmed from a previously unknown software defect in the firewall OS. The defect also caused fail-over to secondary firewall to break. Service was fully restored after a automated reboot of primary firewall. * **Duration:** 22 minutes total impact * **Affected services:** Web Hosting, Virtual Server Platform, URL Redirector, Abion Core, Hosted Exchange, Standard Email \(IMAP\) * **Customer impact:** Websites and servers unreachable, SMTP delivery delays up to 20 min ### 2. Detailed Timeline \(UTC\+2\) | Time | Event | | --- | --- | | **11:41** | Person on-call receiving multiple DOWN alerts | | **11:50** | Ticket raised with NOC, no upstream issues found | | **12:03** | Network restores automatically as firewall reboots | | **12:05** | Root cause analysis begins | | **12:45** | An acknowledged software defect in our firewall identified as root cause | | **14:20** | Scheduled maintenance planned for fireware upgrade, subscribers notified | ### 3. Impact * **Services affected:** All on-prem platforms \(Web Hosting, VPS, Abion Core, Hosted Exchange, Standard Email/IMAP, URL Redirector\). * **Customer experience:** Complete service outage; e-mail queuing/delays. ### 4. Root Cause A **verified firmware defect in our firewall OS** triggers an out-of-memory crash during normal traffic, simultaneously breaking HA fail-over. Vender documents issue is fixed in a later release. ### 5. Detection & Response * **Detection:** Multiple Tech Department staff simultaneously observed loss of connectivity. * **Immediate response:** Raised ticket with ISP; began on-prem investigation; reviewed firewall logs revealing repeated OOM events. ### 6. Resolution Service returned when the primary firewall auto-rebooted \(12:03\). Incident completed by 12:44 after correlating log entries with vendor advisory. ### 7. Preventive / Corrective Actions We’re upgrading both our firewalls next maintenance window due 29 May 2025.