Entitle experienced a notice incident on March 26, 2025 affecting Entitle Portal and API, lasting 2m. The incident has been resolved; the full update timeline is below.
Affected components: Entitle Portal and API
Update timeline
- investigating Mar 26, 2025, 03:01 PM UTC
Application not working.
- resolved Mar 26, 2025, 03:03 PM UTC
This incident has been resolved.
- postmortem Mar 27, 2025, 10:27 AM UTC
**Post-Mortem Report: System Downtime on 03/27/2025**

**Incident Summary:** During the deployment of a hotfix to the production environment, the system experienced unexpected downtime due to database locks. The downtime lasted for approximately 10 minutes, affecting all users attempting to access Entitle.

**Root Cause Analysis:** The deployment process triggered database migrations, and during execution, certain queries caused locks on the database. These locks resulted in database transactions being blocked, leading to system unresponsiveness and eventual downtime.

**Impact:**
* Users were unable to access the system during the downtime window.

**Resolution:**
* The locks were cleared, and the system was restored from downtime.

**Action Items:**
1. **Enforce a Deployment Check:** Ensure that deployments proceed only after all database migrations have been successfully executed to prevent locking issues.
2. **Implement a New Deployment Strategy:** Build and use an improved release deployment strategy to minimize the risk of downtime.
3. **Enhance Monitoring & Alerts:** Set up real-time monitoring and alerting mechanisms to detect database locks early and act proactively.
4. **Improve Rollback Mechanism:** Establish a seamless rollback mechanism to restore the previous stable state quickly in case of deployment issues.

By implementing these action items, we aim to enhance deployment reliability and prevent similar incidents in the future.
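As an illustration of action items 1 and 3, the following is a minimal sketch of a deployment gate that refuses to proceed while migrations are still pending or while any database session has been blocked for too long. It is not Entitle's actual tooling: `LockSample`, `blocked_too_long`, `deployment_gate`, and the `pending_migrations` and `probe_locks` callables are hypothetical names, and the thresholds are placeholder values. How pending migrations and blocked sessions are actually queried depends on the database and migration tool in use, which the report does not name.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class LockSample:
    """One blocked session, as reported by a hypothetical lock probe."""
    session_id: str
    waited_seconds: float


def blocked_too_long(
    samples: Sequence[LockSample], threshold_seconds: float
) -> List[LockSample]:
    """Return the sessions that have waited on a lock for at least the threshold."""
    return [s for s in samples if s.waited_seconds >= threshold_seconds]


def deployment_gate(
    pending_migrations: Callable[[], Sequence[str]],
    probe_locks: Callable[[], Sequence[LockSample]],
    threshold_seconds: float = 5.0,   # assumed value, tune per environment
    checks: int = 3,                  # number of consecutive clean probes required
    interval_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Allow the rollout only if migrations are done and no long lock waits are seen.

    Both callables are injected so the gate itself stays database-agnostic.
    """
    # Action item 1: do not proceed while any migration is still pending.
    if pending_migrations():
        return False

    # Action item 3: require several consecutive probes without long lock waits.
    for attempt in range(checks):
        if blocked_too_long(probe_locks(), threshold_seconds):
            return False
        if attempt < checks - 1:
            sleep(interval_seconds)
    return True
```

In a pipeline, a gate like this would run between the migration step and the application rollout. A `False` result would pause the release, or trigger the rollback path described in action item 4, rather than continuing into a locked database.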