TechnologyOne incident

Service Degradation for a subset of customers in ANZ Region across multiple releases

Major Resolved

TechnologyOne experienced a major incident on February 4, 2025 affecting Ci Anywhere, lasting 4h 9m. The incident has been resolved; the full update timeline is below.

Started
Feb 04, 2025, 01:23 AM UTC
Resolved
Feb 04, 2025, 05:33 AM UTC
Duration
4h 9m
Detected by Pingoru
Feb 04, 2025, 01:23 AM UTC

Affected components

Ci Anywhere

Update timeline

  1. investigating Feb 04, 2025, 01:23 AM UTC

    We are investigating an issue impacting service in the ANZ Region across all releases. A subset of customers in the ANZ Region, on all releases, is experiencing 504 gateway errors. While the investigation continues, the next update will be provided in 60 minutes, or sooner if new information becomes available.

  2. identified Feb 04, 2025, 01:35 AM UTC

    Our team has identified a fix which is being implemented now. We anticipate the implementation of the fix to take 60 minutes to complete and verify.

  3. monitoring Feb 04, 2025, 02:37 AM UTC

    Our team has verified the implementation of a fix is complete for the impacted production environments. We will monitor the logs for the next 2 hours to ensure no further customers are impacted.

  4. resolved Feb 04, 2025, 05:33 AM UTC

    After a few hours of monitoring, this incident is now resolved. We apologise for how you and your business may have been affected by this incident.

  5. postmortem Feb 13, 2025, 11:11 PM UTC

    **Issue Summary**

    On 4 February 2025, a partial network failure occurred at 11.00am AEST. It affected a subset of customers, different from those affected on 3 February 2025, who experienced a range of intermittent 5XX errors. The disruption lasted up to 60 minutes. The TechnologyOne team immediately applied the mitigations developed on 3 February 2025 to recover service. TechnologyOne teams continued to analyse the root cause and develop actions through to 10 February 2025. Preventative actions began to be applied progressively across regions from 2pm on 10 February 2025.

    **Root Cause**

    A partial failure in managed DNS caused network instability, leading to intermittent access to application infrastructure.

    **Corrective Actions**

    Applied the mitigations (developed on 3 February) to recover service. The managed DNS was recovered by our upstream vendor.

    **Preventative Actions**

    Implemented configuration changes in managed DNS to provide additional resilience to the service, based on recommendations from our upstream provider, who reviewed the workload patterns.