Fasterize incident

Acceleration is disabled.

Critical · Resolved

Fasterize experienced a critical incident on June 11, 2025 affecting Acceleration, lasting 3h 8m. The incident has been resolved; the full update timeline is below.

Started
Jun 11, 2025, 01:41 PM UTC
Resolved
Jun 11, 2025, 04:50 PM UTC
Duration
3h 8m
Detected by Pingoru
Jun 11, 2025, 01:41 PM UTC

Affected components

Acceleration

Update timeline

  1. investigating Jun 11, 2025, 01:41 PM UTC

    We currently have some issues on our European infrastructure. We're investigating. Acceleration is disabled. Traffic should be redirected to origins.

  2. identified Jun 11, 2025, 02:02 PM UTC

    The issue has been identified. Some load balancers in one of our datacenters crashed and have been restarted. Traffic was interrupted at 3:31 PM and has been back to normal since 3:42 PM CEST.

  3. monitoring Jun 11, 2025, 03:02 PM UTC

    We're still monitoring.

  4. resolved Jun 11, 2025, 04:50 PM UTC

    Incident is now closed. Sorry for the inconvenience. A post-mortem will follow.

  5. postmortem Jun 12, 2025, 10:29 AM UTC

    ## **Summary**

    On June 11, 2025, Fasterize experienced a service disruption for a subset of customers due to a large-scale DDoS attack. From 15:31 to 15:42 (UTC+2), the surge in traffic overloaded the zone’s load balancers, resulting in a temporary fallback to origin servers for affected sites. Total outage time was 11 minutes, with full platform stability restored at 15:42.

    ## **Timeline (UTC+2)**

    * **15:31 – 15:42**: DDoS attack peaks at ~8 million requests overall, spiking to 50,000 req/s. Load balancers in the zone restart repeatedly.
    * **15:34**: Internal alerts detect the saturation (detection time: 3 min). The health-check system routes traffic directly to origin servers.
    * **15:42**: Traffic normalises; services in the affected zones recover (time to full resolution: 11 min).

    ## **What Happened**

    The volumetric DDoS flooded the load balancers, driving CPU utilisation to 100 %. Although our automatic health-check mechanism successfully redirected traffic to customer origin servers (for those customers whose stack permits this automatic redirection; see the sketch after this timeline), the repeated load-balancer restarts caused an 11-minute service degradation for sites relying on that zone. We had increased load-balancer capacity after a similar event on May 15, but this proved insufficient for the size of yesterday’s attack, highlighting a design limitation in relying on provider-managed load balancers.

    ## **Impact**

    * **Severity level:** 1 — Unplanned downtime affecting multiple production websites
    * **Detection time:** 3 minutes
    * **Time to full recovery:** 11 minutes

    During the incident, visitors to impacted sites received 502 errors until they were served directly from origin infrastructure, potentially with higher latency.

    ## **What We're Doing**

    ### Short-term improvements (in progress)

    * Re-architecture to avoid load-balancer issues

    ### Long-term improvements

    * **Additional zones:** Bringing online new zones to spread traffic and isolate attacks more effectively.
    * **Rate-limit hardening:** Evaluating two third-party CDNs that offer native volumetric-attack rate limiting.
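
As an illustration of the origin fallback described in the post-mortem above, here is a minimal sketch of a health-check loop that fails traffic over to a customer origin after repeated edge failures. This is not Fasterize's actual implementation: the endpoints, thresholds, and the failover action (a simple log line) are hypothetical, chosen only to make the mechanism concrete.

```python
# Minimal, hypothetical sketch of health-check-driven origin fallback.
# All URLs, intervals, and thresholds below are illustrative, not Fasterize's.
import time
import urllib.request

EDGE_HEALTH_URL = "https://edge.example-accelerator.net/healthz"  # hypothetical edge health endpoint
ORIGIN_URL = "https://origin.customer-site.example/"              # hypothetical customer origin

CHECK_INTERVAL_S = 10   # how often the edge is probed
FAILURE_THRESHOLD = 3   # consecutive failures before failing over


def edge_is_healthy(timeout: float = 2.0) -> bool:
    """Return True if the edge answers its health check with HTTP 200."""
    try:
        with urllib.request.urlopen(EDGE_HEALTH_URL, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return False


def monitor() -> None:
    """Poll the edge; switch routing to the origin after repeated failures."""
    failures = 0
    serving_from_edge = True
    while True:
        if edge_is_healthy():
            failures = 0
            if not serving_from_edge:
                serving_from_edge = True
                print("edge recovered: routing traffic back through the accelerator")
        else:
            failures += 1
            if failures >= FAILURE_THRESHOLD and serving_from_edge:
                serving_from_edge = False
                print(f"edge unhealthy: falling back to origin at {ORIGIN_URL}")
        time.sleep(CHECK_INTERVAL_S)


if __name__ == "__main__":
    monitor()
```

In a real deployment the failover action would update DNS or load-balancer configuration rather than print a message; the point of the sketch is the detection loop (probe, count consecutive failures, fail over, fail back).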