Lightdash incident

Intermittent API errors across US

Major Resolved

Lightdash experienced a major incident on March 24, 2026 affecting Lightdash Cloud (US), lasting 1h 43m. The incident has been resolved; the full update timeline is below.

Started
Mar 24, 2026, 10:35 AM UTC
Resolved
Mar 24, 2026, 12:19 PM UTC
Duration
1h 43m
Detected by Pingoru
Mar 24, 2026, 10:35 AM UTC

Affected components

Lightdash Cloud (US)

Update timeline

  1. identified Mar 24, 2026, 10:35 AM UTC

    We have identified an issue in our network configuration that is causing some requests to fail. We've identified a fix and are deploying it to production.

  2. identified Mar 24, 2026, 10:35 AM UTC

    We are continuing to work on a fix for this issue.

  3. identified Mar 24, 2026, 10:36 AM UTC

    We are continuing to work on a fix for this issue.

  4. monitoring Mar 24, 2026, 11:34 AM UTC

    We have deployed a fix to our network configuration and all services are healthy. We will continue monitoring.

  5. resolved Mar 24, 2026, 12:19 PM UTC

    We've seen no errors related to network configuration.

  6. postmortem Mar 24, 2026, 12:21 PM UTC

    **Root cause** Our cloud networking layer had insufficient capacity to handle the volume of outbound connections from our services. As traffic grew, we periodically exceeded the available connection limits, causing some requests to be dropped.

    **What we fixed** We restructured our internal network architecture, significantly reducing the load on our outbound connection layer and freeing up capacity.

    **Ensuring it won’t happen again** We’ve permanently re-architected our network to prevent this specific issue, and we’ve added monitoring and alerts to catch similar network errors.