Dstny incident
Call2Teams – Sync Removing Users due to database limit reached
Dstny experienced a minor incident on August 19, 2025 affecting AP, AUS, and one other component, lasting 2d 2h. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Aug 19, 2025, 12:38 PM UTC
We are currently investigating an issue affecting Call2Teams across multiple regions. This is resulting in sync problems, causing users to be removed and appear as unattached within the Call2Teams portal. Consequently, affected users are experiencing a loss of registrations. Our teams are actively working to identify the root cause and implement a resolution. Updates will be provided every 60 minutes as more information becomes available. We sincerely apologise for any inconvenience caused and appreciate your patience during this time. Kind regards, Dstny Support
- investigating Aug 19, 2025, 01:10 PM UTC
We continue to investigate this incident in collaboration with our Platform team and are actively implementing measures to minimise user impact wherever possible. As part of the mitigation efforts, affected customers have been advised to perform another sync to restore user associations. Please note that users will only reattach to existing PBX components if there is a matching characteristic (such as email address or external ID). Where no match is found, users may need to be manually recreated. We will provide a further update within the next 60 minutes. Thank you for your continued patience. Kind regards, Dstny Support
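The reattach behaviour described in this update — users only reconnect to existing PBX components when a characteristic such as email address or external ID matches — can be sketched roughly as below. All names, the record shapes, and the email-before-external-ID precedence are illustrative assumptions, not the portal's actual implementation:

```python
def find_match(synced_user, pbx_users):
    """Return the existing PBX user record a synced user should reattach to,
    or None when no matching characteristic exists (the user must then be
    recreated manually, as the incident update advises).

    Hypothetical sketch: records are plain dicts with optional
    'email' and 'external_id' keys.
    """
    # Index existing PBX users by each matching characteristic.
    by_email = {u["email"].lower(): u for u in pbx_users if u.get("email")}
    by_external_id = {u["external_id"]: u for u in pbx_users if u.get("external_id")}

    # Assumed precedence: try email first (case-insensitive), then external ID.
    email = (synced_user.get("email") or "").lower()
    if email and email in by_email:
        return by_email[email]
    external_id = synced_user.get("external_id")
    if external_id and external_id in by_external_id:
        return by_external_id[external_id]
    return None  # no match: user appears unattached and needs manual recreation
```

A sync can then call `find_match` for each unattached user and only recreate those for which it returns `None`.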
- investigating Aug 19, 2025, 02:41 PM UTC
We continue to work closely with our Platform team to investigate this incident and minimise disruption to affected users. As part of our ongoing efforts, we are reviewing potentially impacted accounts and actively working with confirmed affected customers to restore data and ensure service continuity. We will provide a further update within the next 60 minutes. Thank you for your continued patience. Kind regards, Dstny Support
- investigating Aug 19, 2025, 03:47 PM UTC
Our Platform team has identified the root cause of the issue, and affected accounts now appear to be back online. No further action is required at this time. We are planning future improvements to prevent similar issues from occurring again and will continue to monitor service availability over the next 24 hours. We do not anticipate any further impact. Thank you for your continued patience. Kind regards, Dstny Support
- monitoring Aug 20, 2025, 07:55 AM UTC
A fix has been implemented and we are monitoring the results.
- monitoring Aug 20, 2025, 01:02 PM UTC
Our Platform team continues to monitor service availability while investigating the issue. We’re planning improvements to prevent similar incidents in future. No further reports of impact have been received, and we do not anticipate any disruption to live services. Thank you for your continued patience. Kind regards, Dstny Support
- monitoring Aug 21, 2025, 02:54 PM UTC
We are continuing to monitor for any further issues.
- monitoring Aug 21, 2025, 02:55 PM UTC
We are pleased to confirm that this incident has been fully resolved. Over the past 24 hours, we have closely monitored the situation and observed no recurrence or further impact. We have determined the root cause and have implemented measures to prevent future occurrences. To provide transparency and insight into the incident, a detailed post-mortem report will be made available within the next 5 business days. We sincerely apologise for any inconvenience caused and thank you for your patience and understanding throughout this incident. Should you have any further questions or concerns, please feel free to reach out to our support team. Thank you, Dstny Support
- resolved Aug 21, 2025, 02:55 PM UTC
This incident has been resolved.
- postmortem Oct 16, 2025, 09:38 AM UTC
**Major Incident Category**
Service Degradation

**Post Mortem Owner**
Ant Hurlock

**Date Post Mortem Completed (UTC)**
27 Aug 2025, 15:00

**Incident Summary**
Between 16:30 UTC on 18 August and 13:10 UTC on 19 August 2025, a failure in the sync process logic led to the unintended deletion of users from the Call2Teams platform. During a period of high load, the system encountered a SQL error caused by exceeding the database’s connection limit. This error disrupted a critical safeguard designed to prevent user deletions, but the sync process continued running instead of halting. As a result, service users were erroneously removed from the database. Affected users became unattached and unlicensed, which prevented access to Call2Teams and Call2Teams Go services throughout the incident. The issue was identified through partner support tickets and confirmed by the Call2Teams Engineering team. A subsequent sync restored most users, while manually configured accounts required customer intervention to fully restore services.

**Root Cause**
The incident was caused by a failure in the system’s sync process during a period of high activity. The platform relies on Microsoft Graph to query user data, but a spike in invalid location responses triggered widespread error handling. This placed unexpected load on the database, which then failed to connect properly at a critical point in the sync. As a result, the system was unable to correctly flag users who should have been retained, and mistakenly removed them, leading to service disruption.

**Incident Resolution**
Once the issue was identified, our Engineering team recovered a backup of the user database to identify all affected accounts. A manual sync process was then used to restore services for impacted users. While most accounts were successfully recovered, some required partner intervention to complete the sync and fully restore access.
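The failure mode in the root cause — a sync that keeps running, and keeps deleting, after the query that decides which users to retain has failed — is commonly prevented by treating any error in that retention check as fatal ("fail closed"). A minimal sketch of that pattern, with all names and the delete loop hypothetical rather than taken from the actual Call2Teams codebase:

```python
class DatabaseError(Exception):
    """Stand-in for a driver error such as 'connection limit exceeded'."""

class SyncAborted(Exception):
    """Raised when the sync cannot safely decide which users to keep."""

def fetch_retained_ids(query_fn):
    """Return the set of user IDs that must NOT be deleted.

    Fail closed: if the retention query errors (for example because the
    database hit its connection limit), abort the whole sync instead of
    letting a failed query look like 'no users need retaining'.
    """
    try:
        return set(query_fn())
    except DatabaseError as exc:
        raise SyncAborted("retention check failed; halting sync") from exc

def run_sync(all_user_ids, query_fn):
    """Delete users absent from the retention set; returns the IDs deleted.

    Raises SyncAborted (deleting nothing) when the retention check fails.
    """
    retained = fetch_retained_ids(query_fn)
    return [uid for uid in all_user_ids if uid not in retained]
```

Under this design the connection-limit error described above would have halted the sync with zero deletions, rather than removing every user the failed query could not flag.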
**Mitigative Actions**
To prevent recurrence, a proactive alert was introduced to detect similar issues early and enable faster response. The affected database was manually scaled to improve performance under high load. Updates to retry logic are scheduled for release, and improvements to safeguard logic are currently being planned, with implementation date to be confirmed.

### **Timeline (UTC)**

* 18 Aug 2025, 16:30 - Incident begins; elevated database load observed.
* 18 Aug 2025, 17:30 - Database load returns to normal naturally.
* 19 Aug 2025, 08:30 - Support query raised; initial investigation initiated by CX Engineering.
* 19 Aug 2025, 09:30 - Re-syncing identified as a potential recovery method.
* 19 Aug 2025, 12:17 - Major Incident raised; partner communications issued via Zone shortly after to inform stakeholders.
* 19 Aug 2025, 13:00 - Backup from 18th Aug used to assist with affected accounts.
* 19 Aug 2025, 13:10 - Mitigation identified; customers advised to perform sync to restore users.
* 19 Aug 2025, 13:45 - Incident moved to monitoring; platform improvements initiated.
* 20 Aug 2025, 16:00 - Root cause analysis identifies database load triggered by exception handling.
* 21 Aug 2025, 15:30 - Error handling confirmed; database scaled to reduce future risk.
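The "updates to retry logic" mentioned in the mitigative actions usually mean retrying transient database errors with exponential backoff before giving up, so a brief connection-limit spike no longer surfaces as a hard failure mid-sync. A sketch of that idea under assumed names (this is not Dstny's scheduled change, and the error type, attempt count, and delays are illustrative):

```python
import random
import time

class TransientDBError(Exception):
    """Stand-in for a retryable error such as 'connection limit exceeded'."""

def with_retries(op, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run op(), retrying transient errors with exponential backoff and jitter.

    Re-raises after the final attempt so the caller still sees the failure
    (and can fail closed) rather than silently continuing without a result.
    """
    for attempt in range(attempts):
        try:
            return op()
        except TransientDBError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff (0.5s, 1s, 2s, ...) with a little jitter
            # so many retrying workers do not hammer the database in step.
            delay = base_delay * (2 ** attempt) * (1 + random.random() * 0.1)
            sleep(delay)
```

Retries like this complement, rather than replace, the fail-closed safeguard: the sync first retries the retention query, and only if every attempt fails does it halt.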