SuperOffice incident

Network disruption

Severity: Critical. Status: Resolved.

SuperOffice experienced a critical incident on March 14, 2025, affecting Login services, Sales & Marketing Client, and three other components, lasting 7h 44m. The incident has been resolved; the full update timeline is below.

Started
Mar 14, 2025, 02:27 PM UTC
Resolved
Mar 14, 2025, 10:12 PM UTC
Duration
7h 44m
Detected by Pingoru
Mar 14, 2025, 02:27 PM UTC

Affected components

Login services, Sales & Marketing Client, Service Client, WebTools, API and Apps

Update timeline

  1. investigating Mar 14, 2025, 02:27 PM UTC

    We are currently investigating a network disruption that is causing intermittent availability issues with the SuperOffice CRM Cloud.

  2. monitoring Mar 14, 2025, 02:51 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Mar 14, 2025, 10:12 PM UTC

    This incident has been resolved.

  4. postmortem Mar 18, 2025, 09:22 AM UTC

    The incident in the datacenter was triggered by an overload of monitoring resources on a specific switch model (N9K-C93180YC-FX3). This switch was configured to handle multiple types of SPAN (Switched Port Analyzer) sessions, which are used to monitor network traffic. However, the switch reached its hardware limit for handling these sessions, leading to a failure. The fault (F3849) indicated that the SPAN limit was exceeded, and a known bug in the system then triggered a reboot. This bug was activated when we tried to set up a combination of monitoring sessions and then removed some of them, and it affected all similar switches in the network.

    In simpler terms, the switch was asked to do more monitoring than it could handle, which led to a failure and reboot. The way we configured and then changed the monitoring settings triggered the bug, causing widespread issues across all similar switches at the same time.
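
The limit condition described in the postmortem can be illustrated with a minimal sketch of a pre-change check that compares planned monitoring sessions against a hardware limit. Everything here is an assumption made for illustration: the `SpanSession` type, the `exceeds_span_limit` function, the session kinds, and the limit of 4 are hypothetical and are not taken from the vendor's tooling or from the switch's documented limits. The sketch does not model the reboot bug itself, only the "more sessions than the hardware supports" condition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanSession:
    """A planned monitoring session (hypothetical model, not a vendor API)."""
    session_id: int
    kind: str  # illustrative label only, e.g. "local" or "erspan"


def exceeds_span_limit(sessions: list[SpanSession], max_sessions: int) -> bool:
    """Return True if the distinct planned sessions exceed the given limit."""
    unique_ids = {s.session_id for s in sessions}
    return len(unique_ids) > max_sessions


# ASSUMPTION: 4 is a placeholder limit, not the real limit of any switch model.
planned = [SpanSession(i, "local") for i in range(1, 6)]  # five sessions
print(exceeds_span_limit(planned, max_sessions=4))  # five sessions against a limit of four
```

A check of this kind would only flag the over-limit configuration before it is applied; whether it would have avoided the reboot depends on details of the bug that the postmortem does not give.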