Alpaca incident

High DB connection usage causing intermittent API failures

Alpaca experienced a critical incident on October 7, 2025 affecting seven components, including the Account API and Transfers, lasting 13h 52m. The incident has been resolved; the full update timeline is below.

Started
Oct 07, 2025, 11:05 PM UTC
Resolved
Oct 08, 2025, 12:57 PM UTC
Duration
13h 52m
Detected by Pingoru
Oct 07, 2025, 11:05 PM UTC

Affected components

Account API, Transfers, Orders API, Positions API, Dashboard, Assets API, Trade Update Streaming

Update timeline

  1. investigating Oct 07, 2025, 11:05 PM UTC

    We are seeing high DB connection usage causing API calls to fail. We are investigating internally.

  2. investigating Oct 07, 2025, 11:20 PM UTC

    We are still working to identify the underlying cause of the high DB connection usage.

  3. investigating Oct 07, 2025, 11:22 PM UTC

    We are continuing to investigate this issue.

  4. investigating Oct 07, 2025, 11:26 PM UTC

    We are restarting the database; API calls are expected to fail during the restart.

  5. investigating Oct 07, 2025, 11:42 PM UTC

    The team is still working on the recovery.

  6. investigating Oct 07, 2025, 11:54 PM UTC

    The database has been restarted and the team is monitoring it.

  7. investigating Oct 08, 2025, 12:27 AM UTC

    After the restart, database connections are under control. We are monitoring the system.

  8. investigating Oct 08, 2025, 01:07 AM UTC

    We are seeing a spike in connections again. Teams are actively working on it.

  9. investigating Oct 08, 2025, 01:28 AM UTC

    The system is working as expected. We are monitoring performance.

  10. monitoring Oct 08, 2025, 01:37 AM UTC

    We will continue to monitor the system and take appropriate action.

  11. monitoring Oct 08, 2025, 01:37 AM UTC

    We are continuing to monitor for any further issues.

  12. monitoring Oct 08, 2025, 01:45 AM UTC

    We have identified and resolved the issue that was affecting our systems. Since implementing the fix, all systems have been operating normally with no recurrence of the problem. We have been closely monitoring our systems over the past hour, and all indicators show stable performance. Our monitoring infrastructure continues to track system health to detect any potential issues early. Our engineering team remains available to respond immediately if any concerns arise.

  13. resolved Oct 08, 2025, 12:57 PM UTC

    This incident has been resolved.

  14. postmortem Oct 08, 2025, 02:01 PM UTC

    High usage of database connections was identified around 6:00 PM ET. The team identified an application process creating contention due to a lock wait, which persisted even after the request timed out. Both the application and the database were restarted to recover. The issue recurred intermittently but was eventually resolved.
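
Diagnosing the failure mode

The postmortem points at a classic pattern: a session waiting on a lock keeps its database connection (and the lock queue behind it) alive even after the originating API request has timed out, so connection usage climbs until the API starts failing. A minimal sketch of how to surface this follows. Alpaca has not disclosed its database engine, so the sketch assumes PostgreSQL; pg_stat_activity, pg_blocking_pids(), and the three timeout settings are standard Postgres features, but the connection string and every threshold value below are illustrative assumptions, not Alpaca's configuration.

```python
import psycopg2

# Illustrative DSN only; not Alpaca's infrastructure.
conn = psycopg2.connect("dbname=app user=app host=db.internal")
conn.autocommit = True

with conn.cursor() as cur:
    # 1. Connection headroom: how close are we to max_connections?
    cur.execute("SELECT count(*) FROM pg_stat_activity")
    in_use = cur.fetchone()[0]
    cur.execute("SHOW max_connections")
    limit = int(cur.fetchone()[0])
    print(f"connections in use: {in_use}/{limit}")

    # 2. Sessions stuck waiting on a lock, plus the PIDs blocking them.
    #    A long-lived row here matches the postmortem's "lock wait that
    #    persisted even after the request timed out".
    cur.execute("""
        SELECT pid,
               pg_blocking_pids(pid) AS blocked_by,
               now() - query_start   AS waiting_for,
               left(query, 60)       AS query
          FROM pg_stat_activity
         WHERE wait_event_type = 'Lock'
         ORDER BY query_start
    """)
    for pid, blocked_by, waiting_for, query in cur.fetchall():
        print(pid, blocked_by, waiting_for, query)

    # 3. Session-level guardrails so a lock wait cannot outlive the
    #    request that caused it (timeout values are illustrative):
    cur.execute("SET lock_timeout = '2s'")
    cur.execute("SET statement_timeout = '10s'")
    cur.execute("SET idle_in_transaction_session_timeout = '30s'")

conn.close()
```

In practice the guardrails would usually be set at the role or pool level (for example, ALTER ROLE app SET statement_timeout = '10s') rather than per session, and any client-side HTTP timeout should be paired with a server-side one: a request that gives up on the client while its transaction keeps waiting on a lock is exactly the contention the postmortem describes.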