Limble CMMS incident

Users unable to access the Limble applications

Critical · Resolved

Limble CMMS experienced a critical incident on April 26, 2024, affecting the Limble CMMS Web Application and lasting 52 minutes. The incident has been resolved; the full update timeline is below.

Started
Apr 26, 2024, 08:43 AM UTC
Resolved
Apr 26, 2024, 09:35 AM UTC
Duration
52m
Detected by Pingoru
Apr 26, 2024, 08:43 AM UTC

Affected components

Limble CMMS Web Application

Update timeline

  1. investigating Apr 26, 2024, 08:43 AM UTC

    We are currently aware of an issue where Limble applications appear to be down. We are investigating.

  2. monitoring Apr 26, 2024, 09:05 AM UTC

    We have implemented a fix to address this issue and are monitoring the results.

  3. resolved Apr 26, 2024, 09:35 AM UTC

    This incident has been resolved.

  4. postmortem May 03, 2024, 02:50 PM UTC

    **Date**: April 26, 2024

    **Status**: Resolved

    # Summary

    On April 26th, 2024, our service experienced a downtime event due to an SSL certificate failure, affecting most of our customers from approximately 2:20 AM to 3:00 AM MST. The issue stemmed from recent changes in the rules around SSL certificate auto-renewal by our SSL provider, which were not fully accounted for in our system. As a result, we have enhanced our alerting systems and updated our infrastructure configurations to prevent similar issues in the future.

    # Impact

    The downtime affected all users attempting to access our site without a cached version of the SSL certificate, resulting in about 40 minutes of service disruption.

    # Root Causes

    The primary cause of the downtime was a change in the auto-renewal rules for SSL certificates by our provider, which led to an unexpected, silent expiration. Although a fix had previously been developed, it had not yet been fully propagated across the entire Limble infrastructure.

    # Resolution and Improvements

    Immediate actions were taken to renew the SSL certificate and mitigate the situation. Following the incident, we fully integrated the auto-renewal fix and enhanced our monitoring and alerting capabilities to detect similar issues ahead of potential disruptions.

    # Description of Events

    * **2:18 AM MST:** SSL certificate for [limblecmms.com](http://forlimblecmms.com/) and some of its subdomains expired.
    * **2:20 AM MST:** Service disruption detected.
    * **2:25 AM MST:** The on-call team was alerted to the issue.
    * **2:33 AM MST:** Critical alert alarm raised.
    * **2:44 AM MST:** Official incident declaration posted on our status page.
    * **3:03 AM MST:** New SSL certificate procured and implemented; service restored.
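
The postmortem does not describe how the enhanced monitoring works. As one illustration of how an expiring certificate could be caught before it lapses silently, the Python sketch below reads a server certificate's `notAfter` field and classifies the time remaining. The function names, the 14-day warning threshold, and the status labels are all assumptions made for this example, not details of Limble's setup.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the server certificate's notAfter string,
    in the form 'Apr 26 08:18:00 2024 GMT'."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` (a timezone-aware datetime) and the expiry time."""
    expires_at = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch
    return (expires_at - now.timestamp()) / 86400


def classify(days_left: float, warn_days: float = 14.0) -> str:
    """Map remaining days to a status label (labels are hypothetical)."""
    if days_left <= 0:
        return "expired"
    if days_left <= warn_days:
        return "warning"
    return "ok"


def check_host(host: str, warn_days: float = 14.0) -> str:
    """Fetch the certificate for `host` and classify its remaining lifetime."""
    try:
        not_after = fetch_not_after(host)
    except ssl.SSLCertVerificationError:
        # The default context rejects expired or otherwise untrusted
        # certificates during the handshake, so no expiry date is available.
        return "invalid"
    now = datetime.now(timezone.utc)
    return classify(days_until_expiry(not_after, now), warn_days)
```

Run on a schedule, a call such as `check_host("example.com")` would return `"warning"` while there is still time to renew. Keeping `days_until_expiry` and `classify` free of network access means the threshold logic can be tested without a live endpoint.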