Lakeside Software incident
SysTrack Cloud – UK – Availability Issues
Lakeside Software experienced a major incident on March 19, 2024 affecting SysTrack API/UI and SysTrack Endpoint Connections, lasting 1d 12h. The incident has been resolved; the full update timeline is below.
Affected components: SysTrack API/UI, SysTrack Endpoint Connections
Update timeline
- identified Mar 19, 2024, 12:59 PM UTC
The issue has been identified on our cloud provider's end, and we are working with them to fix it as soon as possible.
- identified Mar 19, 2024, 01:52 PM UTC
We have been experiencing a widespread issue with a Microsoft managed service (Application Gateway), which is the load balancer used for the SysTrack Cloud product. The issue started yesterday around 7 AM EST in one region and was temporarily resolved, but it recurred today around 12 AM EST. The issue is affecting multiple Azure regions. We are working closely with Microsoft to resolve the issue and will provide updates hourly to the cloud status page as usual.
- identified Mar 19, 2024, 02:55 PM UTC
We are continuing to work closely with Microsoft to resolve the issue and will provide updates hourly to the cloud status page as usual.
- identified Mar 19, 2024, 04:01 PM UTC
We are continuing to work closely with Microsoft to resolve the issue and will provide updates hourly to the cloud status page as usual.
- identified Mar 19, 2024, 05:23 PM UTC
We are continuing to work closely with Microsoft to resolve the issue and will provide updates hourly to the cloud status page as usual.
- identified Mar 19, 2024, 06:24 PM UTC
We are continuing to work closely with Microsoft to resolve the issue and will provide updates hourly to the cloud status page as usual.
- identified Mar 19, 2024, 08:01 PM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 19, 2024, 08:58 PM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 19, 2024, 09:58 PM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 12:22 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 02:26 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 03:21 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 04:51 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 06:00 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 07:13 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 08:12 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 08:14 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 08:23 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 09:05 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 09:08 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- identified Mar 20, 2024, 10:13 AM UTC
In cooperation with Microsoft, we have identified the root cause of the problem. The issue resulted from a recent change to Microsoft's environment. We are working closely with Microsoft to restore all services and mitigate the risk of any reoccurrence.
- monitoring Mar 20, 2024, 10:36 AM UTC
Microsoft has remediated the problem. All services are now operational. We will continue to monitor the situation and provide updates approximately every 6 hours.
- resolved Mar 21, 2024, 01:43 AM UTC
This incident has been resolved.
- postmortem Apr 04, 2024, 01:58 AM UTC
# What was the issue?
All customers experienced intermittent problems loading the SysTrack CE. Agents, as well as users of the SysTrack web product, were unable to connect for periods of time.
# What was the root cause?
This is a preliminary RCA that is subject to change once we receive the final RCA from Microsoft. Thus far, we have determined the root cause to be with Microsoft Azure's Application Gateway. This managed service, fully supported by Microsoft, appears to have received a backend update that Microsoft deployed to the different SysTrack regions. The updated version does not appear to handle our unique workload on the Application Gateway. The managed service became overloaded and went into an _unavailable_ state, in which it did not accept any inbound traffic.
# What is the Prevention Strategy?
We are still working with our cloud provider (Azure) to get a full RCA before we prepare an identification and prevention strategy.