StrongDM incident

Service Degradation on US Control Plane.

StrongDM experienced a critical incident on December 12, 2025 affecting Admin UI and API, lasting 1h 22m. The incident has been resolved; the full update timeline is below.

Started: Dec 12, 2025, 12:49 AM UTC
Resolved: Dec 12, 2025, 02:12 AM UTC
Duration: 1h 22m
Detected by Pingoru: Dec 12, 2025, 12:49 AM UTC

Affected components

Admin UI, API

Update timeline

investigating Dec 12, 2025, 12:49 AM UTC

We are currently investigating the issue. We will update here as we know more.

monitoring Dec 12, 2025, 02:10 AM UTC

We have identified an issue causing performance issues in our product DB. We are monitoring and a more detailed update will follow.

resolved Dec 12, 2025, 02:12 AM UTC

The incident has been resolved and we will provide an RCA within the next 7 days.

postmortem Dec 12, 2025, 07:58 PM UTC

**Summary**

On December 11th, 2025, StrongDM experienced a service degradation affecting the US Control Plane. The issue was caused by resource exhaustion from an internal monitoring component and lasted approximately 73 minutes before service was fully restored.

**What Happened**

An internal monitoring component configured to collect operational telemetry was unable to process data efficiently under US production load. This led to resource exhaustion on the Control Plane. UK and EU Control Planes were not affected due to lower total throughput volumes.

**Resolution**

Infrastructure rolled back the monitoring component's configuration, immediately relieving resource pressure and restoring normal service.

**Prevention & Remediation**

To prevent recurrence, StrongDM is updating testing processes to better catch issues like this and revising configuration review processes for internal tooling updates.