Recorded Future incident

Recorded Future Portal and API Disruption

Recorded Future experienced a critical incident on April 15, 2025 affecting User Interface and API, lasting 38m. The incident has been resolved; the full update timeline is below.

Started: Apr 15, 2025, 03:10 PM UTC
Resolved: Apr 15, 2025, 03:48 PM UTC
Duration: 38m
Detected by Pingoru: Apr 15, 2025, 03:10 PM UTC

Affected components

User Interface, API

Update timeline

investigating Apr 15, 2025, 03:10 PM UTC

Dear Customer, We are currently experiencing a service disruption with the Recorded Future application portal and API. Our product operations team are actively working on restoring accessibility to an optimal level of service. We will continue to provide updates as they become available. Please contact our support team at [email protected] if you have any questions. Regards, Recorded Future Platform Operations

investigating Apr 15, 2025, 03:24 PM UTC

We are continuing to investigate this issue.

resolved Apr 15, 2025, 03:48 PM UTC

Dear Customer, We can now consider the service disruption with our portal and API to be resolved, and a root cause analysis will be produced for this incident. Please contact our support team at [email protected], or your Intelligence Services resource if you have any questions. Regards, Recorded Future Platform Operations

postmortem Apr 26, 2025, 07:17 PM UTC

Dear Customer,

Below is our post mortem to the incident in question from April 15, 2025:

* **Issue:**
  * At approximately 14:52 UTC on April 15th, monitoring alerts indicated that both the Portal and API were unresponsive to customers. Following initial investigations, the incident status was escalated to a “Major Outage” for the “Portal” and “API” components on [status.recordedfuture.com](http://status.recordedfuture.com) at around 15:10 UTC.
* **Cause:**
  * The incident was traced to the main authorization service, which is a critical dependency for backend request handling. One instance within the service group entered an unresponsive state but continued to accept and process threads. This led to a situation where most incoming authentication requests to this service appeared to hang. As a result, dependent services, including the Portal and API, were perceived as unresponsive from the end-user perspective for a short period of time.
* **Remediation / Path Forward:**
  * Corrections were made to rebalance traffic and service requests to restore normal service, and the issue was declared fully resolved at approximately 15:13 UTC.
  * To prevent the recurrence of this issue, the following steps are being undertaken:
    * Enhancing the authorization service to detect and isolate non-responsive instances automatically.
    * Introducing safeguards to prevent thread buildup when an instance becomes unresponsive.
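
The two preventive steps named in the postmortem (isolating non-responsive instances and limiting thread buildup) can be illustrated with a minimal sketch. This is not Recorded Future's implementation, which the report does not describe. The class `AuthPool`, the `probe` and `handler` callables, and the instance names are all hypothetical, and the sketch assumes a health probe that either returns `True` or raises when an instance does not answer within its timeout.

```python
import threading


class AuthPool:
    """Hypothetical pool of authorization-service instances.

    Illustrates two ideas: ejecting instances that fail a timed health
    probe, and capping in-flight requests per instance so callers fail
    fast instead of piling up behind a hung instance.
    """

    def __init__(self, instances, probe, max_in_flight=2):
        self._probe = probe            # assumed: probe(name, timeout) -> bool, may raise
        self._healthy = list(instances)
        self._slots = {
            name: threading.BoundedSemaphore(max_in_flight) for name in instances
        }
        self._lock = threading.Lock()
        self._next = 0                 # round-robin cursor

    def run_health_checks(self, timeout=0.5):
        """Probe every instance; keep only those that answer in time."""
        healthy = []
        for name in self._slots:
            try:
                ok = self._probe(name, timeout)
            except Exception:
                ok = False             # a timeout or error counts as unhealthy
            if ok:
                healthy.append(name)
        with self._lock:
            self._healthy = healthy
        return healthy

    def call(self, request, handler):
        """Send the request to a healthy instance that has a free slot."""
        with self._lock:
            candidates = list(self._healthy)
            start = self._next
            self._next += 1
        for i in range(len(candidates)):
            name = candidates[(start + i) % len(candidates)]
            slot = self._slots[name]
            # Non-blocking acquire: never queue behind a saturated instance.
            if slot.acquire(blocking=False):
                try:
                    return handler(name, request)
                finally:
                    slot.release()
        raise RuntimeError("no healthy instance with free capacity")
```

In this sketch a periodic task would call `run_health_checks()`, and request handlers would go through `call()`. Rejecting a request when every slot is taken trades a fast, visible error for the slow hang described in the cause section; whether that trade-off suits a given service is a design decision the report does not address.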