Onfido incident
Increase of withdrawn device intelligence reports
Onfido experienced a minor incident on September 23, 2025 affecting Device Intelligence, lasting 32 minutes. The incident has been resolved; the full update timeline is below.
Update timeline
- monitoring Sep 23, 2025, 10:08 AM UTC
We have rolled back a faulty deployment that caused device intelligence reports to be withdrawn, and we are currently re-running the affected reports. Studio workflow runs that are in progress will resume once the device intelligence reports have been re-run. The disruption lasted about 1 hour, from 07:04 UTC to 08:06 UTC.
- resolved Sep 23, 2025, 10:40 AM UTC
All withdrawn reports have been re-run. In-progress workflow runs should now have resumed.
- postmortem Oct 06, 2025, 08:44 AM UTC
### Summary

All Device Intelligence reports created on September 23, 2025 between 06:59 and 08:06 UTC were withdrawn. As a consequence, the corresponding Studio tasks did not complete during the incident. Once the root cause was identified and fixed, all withdrawn reports were reprocessed, and the results were delivered through Webhooks or can be fetched through the Public API.

### Root Causes

A faulty deployment of the service that processes the results for this report/task type was released at 06:59 UTC.

### Timeline

* 06:59 UTC: Deployment of the service that processes device intelligence reports begins in production in all regions.
* 07:05 UTC: Deployment finishes successfully.
* 07:08 UTC: First responders are notified of a slight increase in the error rate of Studio tasks.
* 07:21 UTC: A first responder starts looking into the issue, with no clear indication that the deployment caused the error rate increase.
* 07:50 UTC: We establish that 100% of device intelligence reports were withdrawn, which shows a larger issue than an error rate increase in Studio.
* 08:06 UTC: Rollback of the service responsible for processing this report type completes and the error rate recovers.
* 09:05 UTC: Preparation to reprocess the withdrawn reports begins.
* 09:44 UTC: Report reprocessing starts.
* 10:37 UTC: Report reprocessing finishes.

### Remedies

* Improve the alerting in the service for this report type
* Add additional e2e test coverage for the service
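
As an illustration of the first remedy, here is a minimal sketch of an alert on the share of withdrawn reports in a monitoring window. It is not Onfido's actual monitoring: the `WindowCounts` shape, the function names, the 5% threshold, and the minimum sample size are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class WindowCounts:
    """Report outcomes seen in one monitoring window (hypothetical shape)."""
    completed: int
    withdrawn: int


def withdrawn_ratio(counts: WindowCounts) -> float:
    """Return the fraction of reports in the window that were withdrawn."""
    total = counts.completed + counts.withdrawn
    return counts.withdrawn / total if total else 0.0


def should_alert(
    counts: WindowCounts,
    threshold: float = 0.05,  # assumed threshold, not a documented value
    min_reports: int = 20,    # assumed floor to avoid alerting on tiny samples
) -> bool:
    """Alert when enough reports were seen and too many were withdrawn."""
    total = counts.completed + counts.withdrawn
    if total < min_reports:
        return False
    return withdrawn_ratio(counts) >= threshold
```

With a check like this evaluated per window, a window in which every report is withdrawn would trigger the alert directly, rather than relying on a downstream error rate increase in Studio tasks.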