Redocly experienced a major incident on January 26, 2026, lasting 2h 25m. The incident has been resolved; the full update timeline is below.
Update timeline
- resolved Jan 26, 2026, 06:33 PM UTC
We've now resolved the incident. Thanks for your patience.

Root Cause
The outage was caused by a loss of quorum within our primary cluster management layer. As new instances were being rotated into the cluster, the management nodes experienced a sharp increase in resource utilization. This spike in load prevented the nodes from communicating effectively, leading to the loss of a cluster leader and a subsequent breakdown in task scheduling and service discovery.
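
For readers unfamiliar with the term, the sketch below shows what "loss of quorum" typically means. It assumes a majority-based rule, as used by consensus protocols such as Raft. The report does not say which protocol or how many management nodes are involved, so the function names and the node counts here are hypothetical.

```python
def quorum_size(managers: int) -> int:
    """Smallest majority of a cluster with `managers` voting nodes (assumed rule)."""
    return managers // 2 + 1


def has_quorum(managers: int, responsive: int) -> bool:
    """True if enough voting nodes respond in time to elect or keep a leader."""
    return responsive >= quorum_size(managers)


# Hypothetical 5-node management layer: a majority of 3 is required.
print(quorum_size(5))
print(has_quorum(5, responsive=3))
print(has_quorum(5, responsive=2))
```

Under this assumption, nodes that are running but too overloaded to answer in time count the same as failed nodes. That would explain how a spike in resource utilization alone can cost a cluster its leader.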