Onfido incident

Increased latency on check creation

Minor, Resolved

Onfido experienced a minor incident on March 31, 2025 affecting Document Verification, lasting 21m. The incident has been resolved; the full update timeline is below.

Started: Mar 31, 2025, 09:41 PM UTC
Resolved: Mar 31, 2025, 10:03 PM UTC
Duration: 21m
Detected by Pingoru: Mar 31, 2025, 09:41 PM UTC

Affected components

Document Verification

Update timeline

  1. investigating Mar 31, 2025, 09:41 PM UTC

    We are currently experiencing an issue that is negatively impacting latency on check completion.

  2. monitoring Mar 31, 2025, 09:43 PM UTC

    The issue has been resolved and we are monitoring the results.

  3. resolved Mar 31, 2025, 10:03 PM UTC

    This incident has been resolved. A small backlog of manual tasks will be cleared within the next 1-2 hours.

  4. postmortem Apr 09, 2025, 06:49 AM UTC

    ### Summary

    One of our components contributing to automatic processing for Document Reports had a spike of timeout errors from 9:05pm until 9:20pm in the EU cluster. All Document Reports created between 9:20pm and 9:40pm UTC were processed by manual analysts, with a higher turnaround time (TaT).

    ### Root Causes

    Two faulty nodes in our production cluster temporarily slowed down the execution of a CPU-intensive component.

    ### Timeline

    _9:21pm UTC: Elevated error rates for the relevant component triggered an on-call alert._

    _9:28pm UTC: We identified two nodes of our cluster as the cause of slow CPU-intensive executions._

    _9:33pm UTC: We restarted the two nodes._

    _9:40pm UTC: The affected component recovered successfully._

    _9:41pm UTC: Backlog of reports observed. Public incident raised to inform customers of expected time to clear._
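The gaps between the postmortem timeline entries can be worked out with a short script. The following is a minimal sketch, not anything published by the vendor: the event labels are shortened paraphrases, the helper names (`minutes_between`, `step_intervals`) are hypothetical, and it assumes all five entries fall on the same day in UTC, as the postmortem states.

```python
from datetime import datetime

# Timeline entries from the postmortem (Mar 31, 2025, UTC), in 24-hour form.
# Labels are shortened paraphrases for illustration, not vendor wording.
TIMELINE = [
    ("21:21", "on-call alert"),
    ("21:28", "faulty nodes identified"),
    ("21:33", "nodes restarted"),
    ("21:40", "component recovered"),
    ("21:41", "public incident raised"),
]


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes from start to end (same day, HH:MM, 24-hour)."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def step_intervals(timeline):
    """Return (from_label, to_label, minutes) for each consecutive pair."""
    return [
        (a_label, b_label, minutes_between(a_time, b_time))
        for (a_time, a_label), (b_time, b_label) in zip(timeline, timeline[1:])
    ]


if __name__ == "__main__":
    for src, dst, mins in step_intervals(TIMELINE):
        print(f"{src} -> {dst}: {mins} min")
    print("alert -> recovery:", minutes_between("21:21", "21:40"), "min")
```

Read this way, the timeline gives 7 minutes from alert to identifying the nodes, 5 minutes to the restart, and 7 minutes to recovery, or 19 minutes from alert to recovery. The public incident was raised one minute after that.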