TechnologyOne incident

Spatial Services unavailable for a subset of customers - ANZ Region / All Releases

Major Resolved

TechnologyOne experienced a major incident on February 16, 2025 affecting Spatial Cloud, lasting 11h 37m. The incident has been resolved; the full update timeline is below.

Started
Feb 16, 2025, 08:55 PM UTC
Resolved
Feb 17, 2025, 08:33 AM UTC
Duration
11h 37m
Detected by Pingoru
Feb 16, 2025, 08:55 PM UTC
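
The timestamps on this page are in UTC, while the postmortem in the update timeline quotes times in AEST. A minimal sketch for lining the two up follows. It assumes AEST means the fixed UTC+10 offset; the helper name `utc_to_aest` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: AEST is treated as a fixed UTC+10 offset with no daylight saving.
AEST = timezone(timedelta(hours=10), name="AEST")


def utc_to_aest(stamp: str) -> datetime:
    """Parse a status-page timestamp such as 'Feb 16, 2025, 08:55 PM UTC'
    and return the same instant expressed in AEST."""
    naive = datetime.strptime(stamp, "%b %d, %Y, %I:%M %p UTC")
    return naive.replace(tzinfo=timezone.utc).astimezone(AEST)


started = utc_to_aest("Feb 16, 2025, 08:55 PM UTC")
resolved = utc_to_aest("Feb 17, 2025, 08:33 AM UTC")

print(started.strftime("%d %b %Y %I:%M %p %Z"))
print(resolved.strftime("%d %b %Y %I:%M %p %Z"))
```

Under that assumption, the 08:55 PM UTC start on February 16 falls at 06:55 AM AEST on 17 February, which matches the postmortem dating the customer impact to 17 February 2025.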

Affected components

Spatial Cloud

Update timeline

  1. investigating Feb 16, 2025, 08:55 PM UTC

    Our team is troubleshooting an issue impacting Spatial Services for ANZ Region / All Releases. Errors are presenting for IntraMaps and Config Manager. The next update will be provided in 60 minutes, or sooner if new information becomes available.

  2. identified Feb 16, 2025, 09:50 PM UTC

    Our team has identified a fix which is being implemented now. We anticipate the implementation of the fix for all impacted customers to take up to 4 hours to complete and verify.

  3. identified Feb 16, 2025, 11:20 PM UTC

    Our team has restored Spatial Services to a number of impacted customers across the ANZ region. We are continuing to apply the fix to the remaining customers. We will provide another update in 90 minutes, or sooner if new information becomes available.

  4. identified Feb 17, 2025, 12:59 AM UTC

    We are continuing to apply a fix to the remaining impacted customers. We will provide another update in 60 minutes, or sooner if new information becomes available.

  5. identified Feb 17, 2025, 02:00 AM UTC

    We are continuing to apply a fix to the remaining impacted customers. As the incident has progressed, we have found two separate issues. The first issue has been resolved and the impacted customers restored. The remaining impacted customers are affected by a second issue, for which we are continuing to apply a fix. We anticipate it will take 4 hours to restore these customers.

  6. identified Feb 17, 2025, 04:02 AM UTC

    We are continuing to apply a fix to the remaining impacted customers. We will provide another update in 60 minutes, or sooner if new information becomes available.

  7. identified Feb 17, 2025, 04:56 AM UTC

    Our team has restored Spatial Services to most of the impacted customers across the ANZ region. We are continuing to apply a fix to the remaining customers. We will provide another update in 60 minutes, or sooner if new information becomes available.

  8. monitoring Feb 17, 2025, 06:25 AM UTC

    Our team has verified that the fix has now been implemented for all impacted customers. As we brought each of the Spatial Services back online, we noted some performance degradation for customers due to timeouts for those that were not yet online. We will monitor the logs for the next 2 hours to ensure no further customers are impacted.

  9. resolved Feb 17, 2025, 08:33 AM UTC

    After 2 hours of monitoring, this incident is now resolved. We will undertake a Post Incident Review and findings will be posted here on completion. We apologise for how you and your business may have been affected by this incident.

  10. postmortem Feb 20, 2025, 06:47 AM UTC

    **Issue Summary**

    TechnologyOne observed system instability in the spatial data layer over the weekend, following regular storage maintenance by the upstream vendor, and took measures to recover via a database failover on 16 February 2025. On 17 February 2025, a subset of spatial customers in the ANZ region were unable to access Config Manager, IntraMaps and Nearmaps. A portion of these customers was restored by 2pm AEST, with the remainder brought online by 4pm. At 7pm AEST, the logs showed further error states. Action was taken to restore the affected databases from backup, and all were restored by 11pm AEST. At 11.15pm AEST, replication errors were observed and found to result from replication commencing while restoration was underway. Access was restored and replication was re-run, with monitoring into the morning.

    **Root Cause**

    The database cluster became unstable because the Continuous Availability (CA) configuration was not enabled at the spatial storage layer. This configuration prevents inconsistencies in the database during storage maintenance.

    **Corrective Actions**

    - Conducted a failover on the Sunday evening; however, many databases continued to enter a recovery state.
    - Corrected a storage permissions issue.
    - Brought databases online one by one, as each attempt to bring them online as a group failed.
    - Restored from backup a subset of databases found to be in a corrupted state.

    **Preventative Actions**

    Enable CA on the database share on the primary storage for spatial. A maintenance window is planned between 8pm and 9pm AEST on 22 February 2025. Updates will be provided via the status page.
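
The postmortem notes that databases had to be brought online one by one because group attempts failed, with corrupted databases then restored from backup. It does not describe the tooling involved, so the following is only a sketch of that general pattern, not the vendor's procedure. The `bring_online` callable, the database names, and the use of `RuntimeError` to signal failure are all hypothetical.

```python
from typing import Callable, Iterable


def bring_online_with_fallback(
    databases: Iterable[str],
    bring_online: Callable[[list[str]], None],
) -> tuple[list[str], list[str]]:
    """Try the whole group first; if that fails, retry each database alone.

    Returns (online, failed). Databases left in `failed` would be the
    candidates for a restore from backup.
    """
    names = list(databases)
    try:
        bring_online(names)  # group attempt
        return names, []
    except RuntimeError:
        pass  # fall through to one-by-one recovery

    online: list[str] = []
    failed: list[str] = []
    for name in names:
        try:
            bring_online([name])  # isolate each database
            online.append(name)
        except RuntimeError:
            failed.append(name)
    return online, failed


def fake_bring_online(batch: list[str]) -> None:
    # Hypothetical stand-in: any group attempt fails, and "db_c" is corrupted.
    if len(batch) > 1 or "db_c" in batch:
        raise RuntimeError(f"cannot bring online: {batch}")


online, failed = bring_online_with_fallback(
    ["db_a", "db_b", "db_c"], fake_bring_online
)
print("online:", online)
print("needs restore:", failed)
```

Isolating each database keeps one corrupted member from blocking the rest, at the cost of a slower, serial recovery.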