Seravo experienced a major incident on January 28, 2026, affecting the fi-coltrane and fi-ellington clusters and one other component, lasting 4h 38m. The incident has been resolved; the full update timeline is below.
Update timeline
- identified Jan 28, 2026, 05:28 AM UTC
There is an availability issue in multiple clusters affecting a small number of sites. We have identified the problem and are working on a solution.
- identified Jan 28, 2026, 06:57 AM UTC
We are continuing to work on a fix for this issue.
- monitoring Jan 28, 2026, 08:57 AM UTC
A fix has been implemented and we are monitoring the results.
- resolved Jan 28, 2026, 10:07 AM UTC
This incident has been resolved.
- postmortem Jan 30, 2026, 01:03 PM UTC
# Notice of a Service Disruption on January 28, 2026

On Wednesday, January 28, 2026, a failure related to the Nginx web server occurred in several Seravo server clusters. The disruption was caused by an improvement to our security features and a minor change to the Nginx configuration, which caused an outage for a number of sites in the affected clusters. The problem was noticed immediately by our on-call monitoring processes and was fixed as quickly as possible. Technical complications during the rollback delayed the full restoration of site services. The disruption began at 07:00 (UTC+2) and ended at 10:00. We apologise for any inconvenience caused by the disruption.

## Timeline (all timestamps UTC+2 (EET))

* 28.1.2026 07:00 Nginx misconfiguration was rolled out
* 28.1.2026 07:06 Site monitoring noticed site issues and alerted the on-call officer
* 28.1.2026 07:10 Investigation was started
* 28.1.2026 07:21 The root cause was identified
* 28.1.2026 07:21 The problem was escalated to the systems administration team
* 28.1.2026 07:34 The team started fixing the issue
* 28.1.2026 07:49 Some of the sites were recovered
* 28.1.2026 09:06 Almost the whole cluster was fixed
* 28.1.2026 10:00 All affected sites were back online

## Follow-Up Actions

As a result of the incident, we at Seravo have identified the need for the following measures:

* Further development of internal processes and tools to enable faster troubleshooting and recovery.
* Improvements to automated testing for configuration changes.
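The second follow-up item, automated testing for configuration changes, can be illustrated with a minimal pre-reload gate: the service is only reloaded if the candidate configuration first passes its own self-test (for Nginx, `nginx -t`). This is a hypothetical sketch, not Seravo's actual deployment tooling; the function names and commands are assumptions for illustration.

```python
import subprocess

def config_is_valid(test_cmd):
    """Run a configuration self-test command (e.g. ["nginx", "-t"])
    and report whether it passed. Hypothetical sketch, not Seravo tooling."""
    result = subprocess.run(test_cmd, capture_output=True)
    return result.returncode == 0

def safe_reload(test_cmd, reload_cmd):
    """Reload the service only after the config self-test succeeds,
    so an invalid configuration never reaches the running server."""
    if not config_is_valid(test_cmd):
        raise RuntimeError("config self-test failed; roll-out aborted")
    subprocess.run(reload_cmd, check=True)

# Example (assumed commands):
#   safe_reload(["nginx", "-t"], ["nginx", "-s", "reload"])
```

Running such a gate in the deployment pipeline, before any cluster-wide roll-out, is one common way to catch a misconfiguration like the one in this incident before it affects live sites.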