Update timeline
- investigating Apr 27, 2026, 12:16 PM UTC
We are experiencing an increase in 502s on our public API. We believe we have found the cause and are preparing a release to resolve it.
- resolved Apr 27, 2026, 02:59 PM UTC
This has now been resolved.
- resolved Apr 29, 2026, 08:25 AM UTC
Post-Incident Summary

Date: 29 April 2026

Summary

A gradual increase in Records API response payload size eventually exceeded the memory available to the pods serving those requests, causing them to be terminated. As pods were terminated and replaced, the load balancer returned 502 errors for requests routed to them. The growth had been building over several days and was not caught by internal monitoring before a customer reported the issue.

Timeline (UTC)

- Issue began: 24 April, ~12:00
- Reported by customer: 25 April, ~00:00
- Investigation started: 27 April, ~07:00
- Mitigated: 27 April, 15:00
- Resolved: 27 April, 15:00

Root Cause

The Records API response payload size had been growing gradually for several days. Once the payloads crossed the available pod memory headroom, the pods serving those requests began being terminated for exceeding memory limits. As terminated pods were replaced, the load balancer returned 502 errors for requests routed to pods that were shutting down or starting up.

Because the growth was gradual rather than sudden, internal monitoring did not flag it before user-visible errors occurred. Diagnosis was also slowed because the default per-pod memory view smoothed out the underlying spikes, masking the memory-pressure pattern until the smoothing was removed and pod termination state was inspected directly.

Resolution

Per-pod memory limits on the affected service were increased, restoring headroom for the larger payloads and stopping the terminations. Once the new limits were in place, the load balancer stopped returning 502s and the service returned to normal.

Follow-Up Actions

Detection

- A monitor now pages the on-call team when load-balancer 5XX rates rise, so future user-impacting issues are caught immediately rather than relying on customer reports.

System safeguards

- Bound peak memory per Records API request so response sizes cannot pressure pod memory.
- Improve telemetry around per-request response size so unusual growth surfaces in monitoring before it impacts users.
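The resolution described above (raising per-pod memory limits) maps to a small manifest change if the service runs on Kubernetes, which the summary's mention of pods terminated for exceeding memory limits suggests. This is a sketch only: the deployment name, container name, and sizes are illustrative assumptions, not values from the incident.

```yaml
# Illustrative only: names and sizes are assumptions, not taken from the incident.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: records-api            # hypothetical service name
spec:
  template:
    spec:
      containers:
        - name: records-api
          resources:
            requests:
              memory: "512Mi"
            limits:
              memory: "1Gi"    # raised limit restores headroom for larger payloads
```

Raising the limit stops the terminations but does not bound payload growth itself, which is why the safeguards above also cap per-request memory at the application layer.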
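The "bound peak memory per Records API request" safeguard can be sketched generically: serialize records incrementally and stop before a byte budget is exceeded, signalling the caller to paginate rather than returning an unbounded payload. `bounded_records` is a hypothetical helper written for illustration, not Nango's actual API.

```python
import json

def bounded_records(records, max_bytes):
    """Return the longest prefix of `records` whose JSON encoding stays
    under `max_bytes`, plus a flag telling the caller to fetch the next page.
    (Hypothetical helper; the real service's serialization may differ.)"""
    out = []
    size = 2  # account for the enclosing "[]"
    for rec in records:
        encoded = json.dumps(rec)
        cost = len(encoded) + (1 if out else 0)  # "," separator between items
        if size + cost > max_bytes:
            return out, True  # truncated: response capped, caller should paginate
        out.append(rec)
        size += cost
    return out, False

# Example: a 100-byte budget forces paging instead of an ever-growing payload.
records = [{"id": i, "payload": "x" * 30} for i in range(10)]
page, truncated = bounded_records(records, 100)
assert truncated
assert len(json.dumps(page)) <= 100
```

A hard cap like this turns gradual payload growth into an explicit, observable signal (the truncation flag) instead of silent memory pressure, which also feeds the per-request response-size telemetry mentioned above.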