LMS365 incident

Issues accessing Learn365

Notice Resolved

LMS365 experienced a notice-level incident on July 19, 2024, affecting Learn365 United States and lasting 7h 41m. The incident has been resolved; the full update timeline is below.

Started
Jul 19, 2024, 07:15 AM UTC
Resolved
Jul 19, 2024, 02:56 PM UTC
Duration
7h 41m
Detected by Pingoru
Jul 19, 2024, 07:15 AM UTC
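
The reported duration can be cross-checked against the Started and Resolved timestamps above. The following is a minimal sketch, assuming the timestamp layout shown on this page; the helper names `parse_utc` and `format_duration` are hypothetical and not part of any LMS365 or Pingoru tooling.

```python
from datetime import datetime, timezone

# Assumed layout of the timestamps on this page, e.g. "Jul 19, 2024, 07:15 AM UTC".
STAMP_FORMAT = "%b %d, %Y, %I:%M %p"


def parse_utc(stamp: str) -> datetime:
    """Parse a page timestamp and attach the UTC timezone (hypothetical helper)."""
    naive = datetime.strptime(stamp.removesuffix(" UTC"), STAMP_FORMAT)
    return naive.replace(tzinfo=timezone.utc)


def format_duration(start: datetime, end: datetime) -> str:
    """Render the elapsed time between two instants as 'Xh Ym'."""
    total_minutes = int((end - start).total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m"


started = parse_utc("Jul 19, 2024, 07:15 AM UTC")
resolved = parse_utc("Jul 19, 2024, 02:56 PM UTC")
print(format_duration(started, resolved))
```

The sketch only covers the window between the Started and Resolved fields; the first vendor update in the timeline is stamped earlier, on Jul 18.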

Affected components

Learn365 United States

Update timeline

  1. investigating Jul 18, 2024, 11:13 PM UTC

    We are aware of an issue impacting user access to Learn365. We believe it is related to a Microsoft outage and will continue to investigate and triage. https://azure.status.microsoft/en-us/status

  2. investigating Jul 18, 2024, 11:19 PM UTC

    We are continuing to investigate this issue.

  3. investigating Jul 18, 2024, 11:24 PM UTC

    We are continuing to investigate this issue.

  4. identified Jul 19, 2024, 02:42 AM UTC

    Microsoft has identified the underlying issues with Azure services and is currently working to deploy a fix. We anticipate Learn365 services will be restored once Azure services are fully operational. We encourage those impacted to follow along with the status updates posted below. https://azure.status.microsoft/en-us/status

  5. monitoring Jul 19, 2024, 07:15 AM UTC

    As per Microsoft: "We have determined that a backend cluster management workflow deployed a configuration change causing backend access to be blocked between a subset of Azure Storage clusters and compute resources in the Central US region. This resulted in the compute resources automatically restarting when connectivity was lost to virtual disks. We have confirmed mitigation for Azure Storage and Compute resources. We are continuing to investigate and mitigate a number of downstream impacted services that have not returned to a fully healthy state."

    We can confirm that customers in the Central US database region can now access Learn365: our monitoring system recorded the first successful requests at 04:19 AM UTC (12:19 AM EDT). We will continue to monitor the situation until it is fully resolved.

  6. resolved Jul 19, 2024, 02:56 PM UTC

    As per Microsoft: "Services which were impacted by this outage recovered progressively and engineers from the respective teams intervened where further manual recovery was needed. Following an extended monitoring period, we determined that impacted services had returned to their expected availability levels."

    We’re happy to report the issue has now been resolved. Thank you for your patience and please let us know if you see any further issues.

  7. postmortem Jul 24, 2024, 03:28 AM UTC

    On Thursday, July 18th, Learn365 availability was impacted by a Microsoft Azure outage. The outage was exclusive to the Central US region. When we initially updated our Zensai Status Page, we reported that the outage impacted the North Europe region as well. This was a clerical error on our part, and we apologize for any confusion it may have caused.

    For the technical background on the outage's root cause, please refer to the direct quotes from the Microsoft team below.

    "We have determined that a backend cluster management workflow deployed a configuration change causing backend access to be blocked between a subset of Azure Storage clusters and compute resources in the Central US region. This resulted in the compute resources automatically restarting when connectivity was lost to virtual disks. We have confirmed mitigation for Azure Storage and Compute resources. We are continuing to investigate and mitigate a number of downstream impacted services that have not returned to a fully healthy state."

    "Services which were impacted by this outage recovered progressively and engineers from the respective teams intervened where further manual recovery was needed. Following an extended monitoring period, we determined that impacted services had returned to their expected availability levels."