Adapty incident

Elevated API Errors

Major · Resolved

Adapty experienced a major incident on September 10, 2025, affecting the API and Dashboard and lasting 2h 4m. The incident has been resolved; the full update timeline is below.

Started
Sep 10, 2025, 07:51 PM UTC
Resolved
Sep 10, 2025, 09:56 PM UTC
Duration
2h 4m
Detected by Pingoru
Sep 10, 2025, 07:51 PM UTC

Affected components

API, Dashboard

Update timeline

  1. investigating Sep 10, 2025, 07:51 PM UTC

    We're experiencing an elevated level of API errors and are currently looking into the issue.

  2. investigating Sep 10, 2025, 08:21 PM UTC

    There are problems with the Cloudflare network in the UK; we're finding ways to mitigate them. Fallback paywalls are in place.

  3. monitoring Sep 10, 2025, 09:26 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Sep 10, 2025, 09:56 PM UTC

    This incident has been resolved.

  5. postmortem Sep 11, 2025, 07:18 PM UTC

    1. Summary

    From September 9 21:45 UTC to September 11 00:30 UTC, a major incident in Cloudflare's infrastructure affected the availability of Adapty services. In total, 3.3% of clients were affected, primarily in Poland, the United Kingdom, the United States, and Germany. The issue was caused by incorrect traffic routing to Cloudflare data centers and was not related to Adapty's internal infrastructure.

    2. Timeline

    | Timestamp | Event |
    | --- | --- |
    | 9.09 19:40 UTC | Incident start, first alerts received |
    | 9.09 19:40 UTC | Automated checks detected [api.adapty.io](http://api.adapty.io/) unavailability |
    | 9.09 19:41 UTC | Incident escalated to S1, on-call team engaged |
    | 9.09 19:45 UTC | Retransmissions detected on front nodes, CF confirmed as source |
    | 9.09 20:00 UTC | Diagnostics showed problematic UK traffic routed through London Heathrow (LHR) |
    | 9.09 20:41 UTC | Cache disabled in the Cloudflare CDN |
    | 10.09 21:13 UTC | Cloudflare proxying disabled; traffic routed directly to the origin |
    | 10.09 21:30 UTC | Issue partially resolved, reduction in 503 errors |
    | 11.09 21:35 UTC | Full service restoration, incident end |

    3. Three whys

    * **Why did users receive 503 errors?** Cloudflare edge servers couldn't process incoming connections.
    * **What caused the Cloudflare network problem?** Incorrect traffic routing overloaded specific CF data centers (especially LHR).
    * **Why couldn't we prevent user impact?** We depend on an external CDN provider without the ability to switch quickly to an alternative.

    4. Action items

    * Implement monitoring of CF edge status codes with alerts when thresholds are exceeded
    * Develop a procedure for quickly switching DNS to direct IPs in case of CF issues
    * Research multi-CDN setups for critical endpoints
    * Document CF issue diagnostics procedures (cdn-cgi/trace, colo codes); see the sketch after this timeline

    **We're sorry and are working hard to implement solutions to prevent this in the future.**
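
As a hedged illustration of the diagnostics action item above, the sketch below polls Cloudflare's standard `/cdn-cgi/trace` endpoint and reports the `colo` code that served the request. The probed hostname, the timeout, and the set of colo codes to flag are illustrative assumptions, not values taken from the incident report.

```python
#!/usr/bin/env python3
"""Sketch of a cdn-cgi/trace check for Cloudflare routing diagnostics.

Assumptions: the probed hostname, the suspect colo set, and the timeout
are illustrative placeholders, not part of the incident report.
"""
import urllib.request

TRACE_URL = "https://api.adapty.io/cdn-cgi/trace"  # standard Cloudflare trace endpoint
SUSPECT_COLOS = {"LHR"}  # data center flagged during this incident (assumption)


def fetch_trace(url: str = TRACE_URL, timeout: float = 5.0) -> dict:
    """Fetch the trace endpoint and parse its key=value lines into a dict."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return dict(line.split("=", 1) for line in body.splitlines() if "=" in line)


def check_routing(trace: dict) -> None:
    """Report which Cloudflare data center (colo) served the request."""
    colo = trace.get("colo", "unknown")
    if colo in SUSPECT_COLOS:
        print(f"WARNING: traffic routed through {colo}, the colo seen during the incident")
    else:
        print(f"OK: traffic routed through {colo}")


if __name__ == "__main__":
    check_routing(fetch_trace())
```

Run during an event like this one, such a check would make it immediately visible whether traffic is still being routed through the overloaded data center.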