ServiceChannel incident

Work Orders issue - displaying inconsistent results

ServiceChannel experienced a major incident on September 25, 2025, affecting Work Order Manager and lasting 4h 15m. The incident has been resolved; the full update timeline is below.

Started: Sep 25, 2025, 01:07 PM UTC
Resolved: Sep 25, 2025, 05:22 PM UTC
Duration: 4h 15m
Detected by Pingoru: Sep 25, 2025, 01:07 PM UTC

Affected components

Work Order Manager

Update timeline

investigating Sep 25, 2025, 01:07 PM UTC

We are currently investigating this issue.

identified Sep 25, 2025, 02:08 PM UTC

The issue has been identified and a fix is being implemented.

identified Sep 25, 2025, 04:17 PM UTC

We are continuing to work on a fix for this issue.

monitoring Sep 25, 2025, 05:15 PM UTC

A fix has been implemented and we are monitoring the results.

resolved Sep 25, 2025, 05:22 PM UTC

This incident has been resolved.

postmortem Oct 10, 2025, 07:23 PM UTC

**Incident Report: Database Index Degradation Resulting in Work Order Issues**

**Date of Incident:** 09/25/2025
**Time/Date Incident Started:** 09/25/2025, 9:07 AM EDT
**Time/Date Stability Restored:** 09/25/2025, 1:15 PM EDT
**Time/Date Incident Resolved:** 09/25/2025, 1:22 PM EDT
**Users Impacted:** Some
**Frequency:** Intermittent
**Impact:** Major

**Incident description:** On September 25, 2025, ServiceChannel investigated a report of Work Order notes not displaying. Some users experienced intermittent issues with note visibility, including with newly created notes; however, all data was securely saved. Partial service resumed at 12:23 PM EDT, with full functionality restored by 1:15 PM EDT.

**Root Cause Analysis:** The root cause of the incident was traced to performance degradation affecting two key indexes in the database responsible for managing Work Order notes. The indexes had degraded to a state where data retrieval queries became significantly inefficient. Consequently, application requests to display notes timed out before the necessary data could be retrieved. This directly led to the intermittent symptoms experienced by some users, including the failure to load existing notes and the non-display of newly submitted notes.

**Actions Taken:**

* The SRE team declared an incident, and additional engineering resources were engaged once the issue was confirmed to be impacting at least 2 clients.
* Initial remediation efforts included restarting application services and rolling back a recent code release, neither of which resolved the issue.
* The investigation identified database index degradation as the root cause, and rebuilding of the two indexes began.
* Concurrently, a code-based mitigation was implemented to circumvent the inefficient queries, serving as an alternative resolution method.
* At 12:23 PM EDT, the code mitigation was deployed and the first index rebuild was completed, which restored partial functionality for users.
* Full service stability was restored at 1:15 PM EDT upon the successful completion of the second index rebuild.

**Mitigation Measures:**

* The planned migration of the Work Order notes database to a new platform has been successfully completed. This new environment offers enhanced monitoring and management tools that will help prevent and more quickly resolve index-related performance issues in the future.
* The code-based mitigation deployed during the incident has been permanently implemented. This change optimizes the logic for retrieving notes, making the application more resilient to potential database performance fluctuations.
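
The report does not publish the code-based mitigation, so the following is only an illustrative sketch of the failure mode in the root cause analysis (a notes query exceeding the request timeout) and of one generic way an application can route around a slow query. Every name here (`slow_indexed_query`, `fallback_query`, `load_notes`, the timeout value) is hypothetical and is not taken from ServiceChannel's code.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def slow_indexed_query(work_order_id):
    """Hypothetical stand-in for a notes query slowed by a degraded index."""
    time.sleep(0.5)  # simulated slow retrieval
    return [f"note for work order {work_order_id} (primary path)"]


def fallback_query(work_order_id):
    """Hypothetical stand-in for an alternative path that avoids the slow query."""
    return [f"note for work order {work_order_id} (fallback path)"]


def load_notes(work_order_id, primary, fallback, timeout_s, pool):
    """Try the primary query; if it exceeds timeout_s, use the fallback instead."""
    future = pool.submit(primary, work_order_id)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The primary call keeps running in its worker thread; we simply
        # stop waiting for it and answer from the fallback path.
        return fallback(work_order_id)


with ThreadPoolExecutor(max_workers=2) as pool:
    # Assumed 50 ms budget, far shorter than the simulated 500 ms query.
    notes = load_notes(42, slow_indexed_query, fallback_query, 0.05, pool)

print(notes)
```

Without the fallback branch, the timed-out request would return nothing to the user even though the notes are stored safely, which matches the symptom described above. A fallback of this kind only hides the latency; the index rebuilds were still needed to fix the underlying queries.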