SuperOffice experienced a critical incident on March 14, 2025, affecting Login services, Sales & Marketing Client, and 1 more component, lasting 7h 44m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Mar 14, 2025, 02:27 PM UTC
We are currently investigating a network disruption that is causing intermittent availability issues with the SuperOffice CRM Cloud.
- monitoring Mar 14, 2025, 02:51 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Mar 14, 2025, 10:12 PM UTC
This incident has been resolved.
- postmortem Mar 18, 2025, 09:22 AM UTC
The incident in the datacenter was triggered by an overload of monitoring resources on a specific switch model (N9K-C93180YC-FX3). This switch was configured to handle multiple types of SPAN (Switched Port Analyzer) sessions, which are used to monitor network traffic. The switch reached its hardware limit for these sessions, and fault F3849 indicated that the SPAN limit had been exceeded. Because of a known bug in the system, this triggered a reboot. The bug was activated when we set up a combination of monitoring sessions and then removed some of them, and it affected all similar switches in the network.

In simpler terms, the switch was asked to do more monitoring than it could handle, which led to a failure and a reboot. The way we configured and then changed the monitoring settings triggered the bug, causing widespread issues across all similar switches at the same time.
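As an illustration only, the kind of pre-change check that guards against exceeding a session limit can be sketched as below. This is not the tooling used in the datacenter and it does not reproduce the vendor bug. The limit value, the `SpanSession` type, and the `check_span_change` function are all hypothetical names chosen for this example; the real hardware limit for a given switch model would have to come from the vendor's documentation.

```python
from dataclasses import dataclass

# Hypothetical limit, for illustration only. It is NOT the documented
# limit of any particular switch model.
ASSUMED_MAX_SPAN_SESSIONS = 4


@dataclass(frozen=True)
class SpanSession:
    name: str
    kind: str  # illustrative label, e.g. "local" or "erspan"


def check_span_change(current, to_add, to_remove, limit=ASSUMED_MAX_SPAN_SESSIONS):
    """Evaluate a planned change in which sessions are added first and removed afterwards.

    Returns (ok, peak, final):
      peak  - highest number of sessions present at any point during the change
      final - number of sessions left once the removals are done
      ok    - True only if the peak stays within the limit
    """
    current_names = {s.name for s in current}
    added_names = {s.name for s in to_add}
    # Peak occurs after all additions and before any removal.
    after_add = current_names | added_names
    peak = len(after_add)
    final = len(after_add - set(to_remove))
    return peak <= limit, peak, final


if __name__ == "__main__":
    existing = [
        SpanSession("mon-a", "local"),
        SpanSession("mon-b", "local"),
        SpanSession("mon-c", "erspan"),
    ]
    planned = [SpanSession("mon-d", "erspan"), SpanSession("mon-e", "local")]
    ok, peak, final = check_span_change(existing, planned, ["mon-a", "mon-b"])
    print(f"ok={ok} peak={peak} final={final}")
```

The point of tracking the peak separately from the final count is that a change can end within the limit and still exceed it partway through, when additions are applied before removals.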