Lakeside Software incident

SysTrack America's Cloud – Reports of Delayed Performance

Minor Resolved

Lakeside Software experienced a minor incident on April 11, 2023 affecting SysTrack API/UI, lasting 2h 57m. The incident has been resolved; the full update timeline is below.

Started
Apr 11, 2023, 05:27 PM UTC
Resolved
Apr 11, 2023, 08:25 PM UTC
Duration
2h 57m
Detected by Pingoru
Apr 11, 2023, 05:27 PM UTC

Affected components

SysTrack API/UI

Update timeline

  1. identified Apr 11, 2023, 06:27 PM UTC

    Some users may experience slowness when using the SysTrack Cloud UI or making API calls. We have identified the root cause. We apologize for any inconvenience.

  2. monitoring Apr 11, 2023, 06:29 PM UTC

    We have remediated the root cause and are actively monitoring the situation.

  3. resolved Apr 11, 2023, 08:25 PM UTC

    We have identified the root cause and implemented a fix, and all systems have been fully restored. We will continue to closely monitor all services, but if you have any issues, please contact Lakeside Support at [email protected].

  4. postmortem Apr 19, 2023, 01:56 PM UTC

    # What was the issue?

    Some clients experienced intermittent slowness or 500 errors when accessing the SysTrack website or APIs.

    # What was the root cause?

    The root cause was different for each issue:

    * High CPU on Internal Service: An internal service was overloaded and did not scale fast enough to support the load. This took down other services on shared infrastructure.
        * To resolve, additional capacity has been kept as the minimum for the service until an application fix can be put into place. The application architecture update is currently being designed. The implementation timeline is TBD.
    * Azure Managed Service: Once we increased the capacity on various services, an Azure service started to have intermittent failures.
        * To resolve, we scaled down the one service that had the issue.
        * We are still actively working with Microsoft to fully identify the root cause of the issue.

    # What is the prevention strategy?

    * High CPU on Internal Service:
        * Additional alerting on internal shared resources
        * Review of internal services that share capacity
    * Azure Managed Service:
        * TBD pending RCA with Microsoft