Nebula incident
Calls are down on the CallSwitch and CallSwitchOne platforms.
Nebula experienced a major incident on May 5, 2025, affecting Core Network and lasting 54 minutes. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating May 05, 2025, 03:00 PM UTC
We have found the root cause of the issue. A fix is in progress. Next update in 5 minutes.
- investigating May 05, 2025, 03:08 PM UTC
We are continuing to investigate this issue.
- investigating May 05, 2025, 03:20 PM UTC
We are still experiencing issues with outbound calls. The issue is under investigation. Next update shortly.
- investigating May 05, 2025, 03:45 PM UTC
We are continuing to investigate this issue.
- resolved May 05, 2025, 03:55 PM UTC
The issue is now resolved. All services are up and running.
- postmortem May 07, 2025, 02:11 PM UTC
**What Happened** At 03:10 PM BST, our monitoring systems detected connectivity issues between the database and the Nebula platform, caused by an unexpected network refresh. When the network reconnected, the database encountered an authentication error. This prevented inbound and outbound calls from connecting for 90 minutes.
**Impact** Inbound and outbound calls were impacted and could not be connected through the platform.
**Resolution** We took immediate corrective measures, starting with identifying the root cause of the issue (DNS related). Services were restored within 90 minutes, with no further delays in the call flow. A subsequent 24-hour monitoring period showed no recurrence and allowed us to investigate and confirm the nature of the fault. Following the incident, we have implemented additional DNS and connectivity health checks, which run every minute.
**Next Steps** The corrective and preventative measures now in place will significantly reduce the risk of similar incidents in the future. We are committed to providing a reliable, high-quality voice service to our partners and customers, and will continue to review and improve our platform's resilience and performance.
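For illustration only, a once-a-minute DNS and connectivity health check of the kind mentioned under Resolution might look roughly like the Python sketch below. This is not Nebula's actual implementation: the function names, the `CheckResult` type, the hostname and the port are all hypothetical, and the sketch checks only name resolution and TCP reachability, not database authentication.

```python
import socket
import time
from dataclasses import dataclass


@dataclass
class CheckResult:
    dns_ok: bool
    tcp_ok: bool
    detail: str

    @property
    def healthy(self) -> bool:
        return self.dns_ok and self.tcp_ok


def check_endpoint(host, port, timeout=3.0,
                   resolve=socket.getaddrinfo,
                   connect=socket.create_connection):
    """Run one DNS lookup and one TCP connect against host:port.

    `resolve` and `connect` are injectable so the check can be
    exercised without touching the network.
    """
    # Step 1: DNS. socket.gaierror is a subclass of OSError.
    try:
        resolve(host, port, proto=socket.IPPROTO_TCP)
    except OSError as exc:
        return CheckResult(False, False, f"DNS lookup failed: {exc}")

    # Step 2: connectivity. Open a TCP connection and close it again.
    try:
        conn = connect((host, port), timeout=timeout)
        conn.close()
    except OSError as exc:
        return CheckResult(True, False, f"TCP connect failed: {exc}")

    return CheckResult(True, True, "ok")


def run_checks(host, port, interval=60.0, on_failure=print, max_runs=None):
    """Repeat the check every `interval` seconds (60 s = once a minute).

    `max_runs=None` loops until interrupted; `on_failure` is a hypothetical
    alert hook and defaults to printing the message.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        result = check_endpoint(host, port)
        if not result.healthy:
            on_failure(f"{host}:{port} unhealthy: {result.detail}")
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval)
```

A monitoring host could call, for example, `run_checks("db.example.internal", 5432)`, where both the hostname and the port are placeholders. Reporting DNS and TCP results separately means an alert shows which layer failed, which helps distinguish a DNS-related fault from a plain network outage.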