TechnologyOne incident
Spatial Service Disruption for a subset of customers ANZ Region
TechnologyOne experienced a major incident on February 17, 2025 affecting Spatial Cloud, lasting 11h 50m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- identified Feb 17, 2025, 10:53 AM UTC
Our team has identified a subsequent issue, following the outage earlier today, that is impacting Spatial Services for a subset of customers. We are currently applying fixes to the impacted environments, with the goal of applying the fix for all affected customers before overnight processing commences at 11:30 PM. We anticipate that implementing and verifying the fix will take 2 hours. The next update will be posted when implementation of the fixes has been completed.
- monitoring Feb 17, 2025, 12:57 PM UTC
Our team has verified that the fix has been implemented for the impacted environments. We will monitor the logs for the next few hours to ensure no further customers are impacted.
- resolved Feb 17, 2025, 10:43 PM UTC
After overnight monitoring, this incident is now resolved. We will perform a post-incident review to identify the underlying cause and preventive actions to avoid a repeat in the future, and will post the results here on completion. We apologise for how you and your business may have been affected by this incident.
- postmortem Feb 20, 2025, 06:47 AM UTC
**Issue Summary**

TechnologyOne observed system instability in the spatial data layer over the weekend, following regular storage maintenance by the upstream vendor, and took measures to recover via a database failover on 16 February 2025. On 17 February 2025, a subset of spatial customers in the ANZ region were unable to access ConfigManager, Intramaps and Nearmaps. A portion of these customers was restored by 2pm AEST, with the remainder brought online by 4pm. At 7pm AEST, errors appeared in the logs, indicating further error states. Action was taken to restore the affected databases from backup, and all were restored by 11pm AEST. At 11.15pm AEST, replication errors were observed and found to be the result of replication commencing while restoration was still under way. Access was restored and replication was re-run, with monitoring continuing into the morning.

**Root Cause**

The database cluster became unstable because the Continuous Availability (CA) configuration was not enabled at the spatial storage layer. This configuration prevents database inconsistencies during storage maintenance.

**Corrective Actions**

A failover was conducted on the Sunday evening; however, many databases continued to enter a recovery state. A storage permissions issue was corrected. Databases were brought online one by one, because each attempt to bring them online as a group failed. A subset of databases found to be in a corrupted state was restored from backup.

**Preventative Actions**

Enable CA on the database share on the primary storage for spatial. A maintenance window is planned between 8pm and 9pm AEST on 22/2/2025. Updates will be provided via the status page.
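
The update timestamps above are in UTC, while the postmortem quotes times in AEST. As a rough aid for lining the two up, the following minimal Python sketch converts the three status updates to AEST and recomputes the incident duration. It assumes AEST is a fixed UTC+10 offset; the names `UPDATES_UTC`, `to_aest` and `incident_duration` are illustrative only and are not part of any TechnologyOne tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: AEST is treated as a fixed UTC+10 offset (no daylight saving).
AEST = timezone(timedelta(hours=10), "AEST")

# Status update timestamps copied from the timeline (all UTC).
UPDATES_UTC = {
    "identified": datetime(2025, 2, 17, 10, 53, tzinfo=timezone.utc),
    "monitoring": datetime(2025, 2, 17, 12, 57, tzinfo=timezone.utc),
    "resolved": datetime(2025, 2, 17, 22, 43, tzinfo=timezone.utc),
}


def to_aest(ts: datetime) -> datetime:
    """Convert a timezone-aware timestamp to AEST."""
    return ts.astimezone(AEST)


def incident_duration(updates: dict) -> timedelta:
    """Time elapsed between the first ('identified') and last ('resolved') update."""
    return updates["resolved"] - updates["identified"]


if __name__ == "__main__":
    for status, ts in UPDATES_UTC.items():
        print(f"{status:<11} {ts:%d %b %H:%M} UTC = {to_aest(ts):%d %b %H:%M} AEST")
    print("duration:", incident_duration(UPDATES_UTC))
```

Under that assumption, the 10:53 AM UTC update corresponds to 8:53 PM AEST on 17 February, and the difference between the first and last updates matches the stated duration of 11h 50m.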