Onfido incident

Delays in Webhook delivery in EU region

Major · Resolved

Onfido experienced a major incident on September 17, 2025 affecting Webhooks, lasting 41m. The incident has been resolved; the full update timeline is below.

Started: Sep 17, 2025, 08:34 PM UTC
Resolved: Sep 17, 2025, 09:16 PM UTC
Duration: 41m
Detected by Pingoru: Sep 17, 2025, 08:34 PM UTC

Affected components

Webhooks

Update timeline

  1. investigating Sep 17, 2025, 08:34 PM UTC

    We are currently investigating this issue.

  2. monitoring Sep 17, 2025, 08:52 PM UTC

    The issue has been resolved and webhook delivery is back to normal. We continue to monitor.

  3. resolved Sep 17, 2025, 09:16 PM UTC

    This incident has been resolved.

  4. postmortem Sep 22, 2025, 01:05 PM UTC

    # Summary

    Webhook event deliveries in the EU region were delayed by up to 22 minutes between 18:44 and 20:44 UTC, following an extreme level of webhook notification request timeouts. Mitigation required scaling up the relevant service.

    # Root Causes

    An unusually high number of webhook notification request timeouts saturated the resources available to process webhooks. Our scale-up strategy was not set up correctly to handle that situation.

    # Timeline

    - 17:34 UTC: Request timeouts to specific customer webhook endpoints began to increase sharply.
    - 17:46 UTC: We were notified of modestly abnormal webhook latency for some events.
    - 18:04 UTC: We scaled up the service's resources and latency stabilised.
    - 18:44 UTC: Webhook latency began to degrade significantly for all events.
    - 20:20 - 20:40 UTC: Further adjustments were made to the service's autoscaling configuration.
    - 20:44 UTC: The incident was fully recovered, including clearing of any queued requests.

    # Remedies

    We updated our auto-scaling configuration. Follow-up work is planned to improve handling of abnormal request timeouts.
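
The root cause above (request timeouts saturating the resources that process webhooks) can be illustrated with rough capacity arithmetic. The sketch below is not Onfido's implementation. It assumes a fixed pool of delivery workers, each held for the full duration of a request, and applies Little's law. Every name and number in it (`workers`, the 0.2 s healthy latency, the 10 s timeout, the 100 events/s arrival rate) is hypothetical and chosen only for illustration.

```python
import math


def mean_hold_seconds(timeout_fraction: float, ok_latency: float, timeout_limit: float) -> float:
    """Average time one worker is occupied per delivery attempt.

    A timed-out request holds its worker for the full timeout limit,
    so even a small timeout fraction dominates the average.
    """
    return (1 - timeout_fraction) * ok_latency + timeout_fraction * timeout_limit


def max_throughput(workers: int, hold_seconds: float) -> float:
    """Deliveries per second a fixed pool can sustain (Little's law)."""
    return workers / hold_seconds


def workers_needed(arrival_rate: float, hold_seconds: float) -> int:
    """Smallest pool size that keeps up with the arrival rate."""
    # Round first so floating-point noise does not add a spurious worker.
    return math.ceil(round(arrival_rate * hold_seconds, 6))


if __name__ == "__main__":
    # All values below are hypothetical, not taken from the incident report.
    workers = 50
    arrival_rate = 100.0  # events per second
    healthy = mean_hold_seconds(0.0, ok_latency=0.2, timeout_limit=10.0)
    degraded = mean_hold_seconds(0.1, ok_latency=0.2, timeout_limit=10.0)

    print("healthy capacity/s: ", max_throughput(workers, healthy))
    print("degraded capacity/s:", round(max_throughput(workers, degraded), 1))
    print("workers needed when degraded:", workers_needed(arrival_rate, degraded))
```

Under these assumed numbers, a 10% timeout rate cuts sustainable throughput from 250 deliveries per second to roughly 42, below the assumed arrival rate, so a queue builds until the pool is scaled up or the slow endpoints are isolated. This is consistent with the report's account that scaling and autoscaling changes were needed, but the actual figures for the incident are not published.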