TechnologyOne incident

Service degradation/Outage for a subset of customers in ANZ Region for all Releases

TechnologyOne experienced a major incident on February 10, 2025 affecting Ci Anywhere, lasting 2h 50m. The incident has been resolved; the full update timeline is below.

Started: Feb 10, 2025, 03:38 AM UTC
Resolved: Feb 10, 2025, 06:29 AM UTC
Duration: 2h 50m
Detected by Pingoru: Feb 10, 2025, 03:38 AM UTC

Affected components

Ci Anywhere

Update timeline

monitoring Feb 10, 2025, 03:38 AM UTC

Our team has verified this disruption is resolved for the subset of customers in ANZ Region for all Releases. We will monitor the logs for the next 2 hours to ensure no further customers are impacted.
monitoring Feb 10, 2025, 03:39 AM UTC

We are continuing to monitor for any further issues.
investigating Feb 10, 2025, 03:50 AM UTC

We are investigating because our monitoring identified a recurrence of the issue impacting service for a subset of customers in ANZ Region for all Releases. Due to the investigation, the next update will be provided in 60 minutes, or sooner if new information becomes available.
identified Feb 10, 2025, 03:55 AM UTC

Our team has identified a fix which is being implemented now. We anticipate the implementation of the fix to take 30 minutes to complete and verify.
monitoring Feb 10, 2025, 04:21 AM UTC

Our team has verified the implementation of the fix is complete. We will monitor the logs for the next 2 hours to ensure no further customers are impacted.
resolved Feb 10, 2025, 06:29 AM UTC

After 2 hours of monitoring, this incident is now resolved. We apologise for how you and your business may have been affected by this incident.
postmortem Feb 13, 2025, 11:12 PM UTC

**Issue Summary** On 10 February 2025, a partial failure in the network occurred at 1.00pm AEST. This impacted a subset of customers, like those impacted on 3 February 2025, who experienced a range of 5XX errors; many users were unable to log in. The disruption lasted 2-3 minutes for Production and 3-5 minutes for non-production, then auto-recovered. It recurred 30 minutes later, with a further 2-3 minute disruption for Production and 3-5 minute disruption for non-production, and again auto-recovered. Preventative actions began to be applied progressively across regions from 2pm on 10/2/2025.

**Root Cause** A partial failure in managed DNS caused network instability, leading to intermittent access to application infrastructure.

**Corrective Actions** The managed DNS was automatically recovered.

**Preventative Actions** Configuration changes were implemented in managed DNS to provide additional resilience to the service. These changes were based on recommendations from our upstream provider, who reviewed the workload patterns.