Alpaca incident

Degraded Performance: Instant Funding V1 API

Minor · Resolved

Alpaca experienced a minor incident on December 15, 2025, affecting Instant Funding and lasting 5h 14m. The incident has been resolved; the full update timeline is below.

Started
Dec 15, 2025, 01:11 PM UTC
Resolved
Dec 15, 2025, 06:26 PM UTC
Duration
5h 14m
Detected by Pingoru
Dec 15, 2025, 01:11 PM UTC

Affected components

Instant Funding

Update timeline

  1. monitoring Dec 15, 2025, 01:11 PM UTC

    Current Status: Monitoring & Verification
    Our engineering team has implemented a service restart and is actively monitoring all relevant metrics to confirm service stability and full recovery.

    Summary
    Starting at approximately 7:24 AM EST (12:24 UTC), clients attempting to use the v1/instant_funding API may have experienced elevated error rates, specifically HTTP 500 errors or Gateway Timeout messages. This issue was caused by a service timeout in our Payment Service when communicating with the GoBroker API.

    Impact
    Users attempting Instant Funding transactions were unable to complete them. This primarily affected partners utilizing the v1/instant_funding endpoint.

    Update
    The Payment Service and its worker components were restarted between 12:34 UTC and 12:40 UTC. Following the restart, service metrics indicate a return to normal request flow, and error rates have significantly decreased. We are currently monitoring the system and awaiting confirmation of full resolution from affected partners.

  2. monitoring Dec 15, 2025, 01:53 PM UTC

    We are processing large batches of memo posts; this can take a few minutes, so please expect some delays for now. We are actively monitoring for any issues and will keep you posted.

  3. monitoring Dec 15, 2025, 02:17 PM UTC

    We are currently accepting new memo posts, but processing is slower than usual. Our team is actively monitoring the system for any issues, and we will provide updates as we make progress.

  4. monitoring Dec 15, 2025, 02:31 PM UTC

    Memo posts are currently flowing correctly, and we continue to monitor the system as a few items are still pending.

  5. monitoring Dec 15, 2025, 02:54 PM UTC

    We encountered a minor database issue while processing a large batch of memo posts. That issue has been fixed, and we are currently monitoring the system as the previously-stuck transfers are now being sequentially processed.

  6. monitoring Dec 15, 2025, 03:57 PM UTC

    All memo posts have been processed and normal operations have resumed. We are still actively monitoring Instant Funding activity to check for any new or residual issues.

  7. resolved Dec 15, 2025, 06:26 PM UTC

    We have not observed any memo post issues since our last update, so we are marking this incident as resolved.

  8. postmortem Dec 16, 2025, 04:05 PM UTC

    ## Degraded Instant Funding API Performance (December 15, 2025)

    ### What Happened

    Starting at approximately 7:24 AM EST (12:24 UTC) on December 15, 2025, clients using the Instant Funding API experienced an increase in transaction failures and gateway timeouts. Our initial investigation identified a technical issue where our internal payment processing service experienced significant delays when communicating with our core transaction engine. This was further complicated by unexpected behavior in the internal ledger system that led to transaction processing issues, causing a backlog of pending funding requests. These concurrent issues collectively resulted in slow processing times and the eventual timeout errors visible to our partners.

    ### Impact

    We understand this event caused operational delays and deeply regret the inconvenience.

    * **Service Affected:** Instant Funding API (specifically the `v1/instant_funding` endpoint).
    * **User Experience:** Affected users attempting Instant Funding transactions were unable to complete them, resulting in visible HTTP 500 errors or gateway timeouts (a client-side retry sketch follows this postmortem).
    * **Scope:** The issue primarily impacted partners using the Instant Funding API. At its peak, approximately **8,300 funding transfers** across multiple partners were delayed and queued for processing.
    * **Data Integrity:** **Crucially, we can confirm that no client funds or transaction data were lost or compromised.** All affected requests were successfully retained in a pending state and processed upon resolution.

    ### Resolution

    Our engineering teams were immediately mobilized and worked continuously to resolve the issue and process the accumulated backlog.

    1. **Immediate Restoration:** The primary service involved in the funding flow was restarted to terminate open connections and locks held in invalid states, which immediately mitigated the initial timeouts and allowed new requests to flow correctly.
    2. **Addressing Core Instability:** Our internal core transaction engine was identified as contributing to the issue, and we immediately performed a controlled rollback to a previously stable version, which resolved the transaction processing errors.
    3. **Backlog Clearance:** To clear the substantial backlog of pending transfers in a stable manner, we temporarily paused automated processing and executed the transfers in smaller, controlled batches (a batching sketch also follows below).
    4. **Full Recovery:** All service components returned to normal processing capacity, and the entire backlog of approximately 8,300 transfers was successfully processed by 10:14 AM EST (15:14 UTC). Normal API response times and throughput have been fully restored.

    ### Commitment & Preventative Measures

    We are committed to maintaining the highest level of service reliability. Our team has initiated a detailed post-incident review and is prioritizing the following themes to prevent recurrence:

    * **System Resilience & Stability:** We are immediately prioritizing and implementing a planned architectural update to the payment service to allow it to handle large batches of pending transfers more robustly and efficiently.
    * **Monitoring and Alerting:** We will enhance our database monitoring systems to more quickly detect and alert on internal system resource contention (e.g., database transaction processing issues), allowing for faster intervention before customer-facing impact occurs.
    * **Infrastructure Optimization:** We will conduct a comprehensive capacity review of the relevant service components and transaction databases to ensure they can scale effectively under high-volume load and prevent bottlenecks that can lead to system-wide lockups.

    We appreciate your patience and partnership as we continue to improve the reliability of our platform.
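For partners integrating against the `v1/instant_funding` endpoint, the HTTP 500s and gateway timeouts described in the first update and in the Impact section are the kind of transient failures a client can guard against with bounded retries. The sketch below is illustrative only: the host, authentication header, and payload shape are assumptions rather than Alpaca's documented schema, and a money-movement request should only be resubmitted once you have confirmed the original attempt was not recorded.

```python
# Illustrative sketch: bounded retries with exponential backoff for transient
# 5xx / timeout failures when POSTing to the Instant Funding endpoint.
# The host, auth header, and payload fields are assumptions, not Alpaca's
# documented API schema.
import time

import requests

BROKER_API_HOST = "https://broker-api.example.com"  # hypothetical host


def submit_instant_funding(payload: dict, api_key: str, max_attempts: int = 4) -> dict:
    """POST an instant funding request, retrying on 5xx responses and timeouts."""
    backoff = 1.0
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.post(
                f"{BROKER_API_HOST}/v1/instant_funding",
                json=payload,
                headers={"Authorization": f"Bearer {api_key}"},  # auth scheme assumed
                timeout=10,
            )
        except (requests.Timeout, requests.ConnectionError):
            if attempt == max_attempts:
                raise
        else:
            if resp.status_code not in (500, 502, 503, 504):
                resp.raise_for_status()  # non-retryable 4xx errors surface immediately
                return resp.json()
            if attempt == max_attempts:
                resp.raise_for_status()  # give up after the final retryable failure
        # Retryable failure (timeout or 5xx): back off before the next attempt,
        # assuming you have verified the transfer was not already created.
        time.sleep(backoff)
        backoff *= 2
    raise RuntimeError("unreachable")  # every final attempt returns or raises above
```

In practice, confirm whether the original request actually succeeded server-side before resubmitting; blindly retrying a funding call that was recorded could double-fund an account.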
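Resolution step 3 of the postmortem describes clearing the roughly 8,300 queued transfers in smaller, controlled batches rather than in one large run. A minimal sketch of that general batching pattern is below; the function names, batch size, and pause interval are hypothetical illustrations, not Alpaca's internal tooling.

```python
# Minimal sketch of draining a backlog in small, controlled batches instead of
# one large run, so the downstream service and database are not overwhelmed.
# fetch of pending IDs and process_transfer() are hypothetical stand-ins.
import time
from typing import Callable, Iterable, List


def drain_backlog(
    pending: Iterable[str],
    process_transfer: Callable[[str], None],
    batch_size: int = 200,       # assumed batch size, tuned to downstream capacity
    pause_seconds: float = 5.0,  # brief pause between batches to let load settle
) -> List[str]:
    """Process queued transfer IDs in fixed-size batches, returning any failures."""
    failed: List[str] = []
    batch: List[str] = []
    for transfer_id in pending:
        batch.append(transfer_id)
        if len(batch) == batch_size:
            failed += _run_batch(batch, process_transfer)
            batch = []
            time.sleep(pause_seconds)
    if batch:  # flush the final, partially filled batch
        failed += _run_batch(batch, process_transfer)
    return failed


def _run_batch(batch: List[str], process_transfer: Callable[[str], None]) -> List[str]:
    """Run one batch, collecting failed IDs instead of aborting the whole drain."""
    failures: List[str] = []
    for transfer_id in batch:
        try:
            process_transfer(transfer_id)
        except Exception:
            failures.append(transfer_id)  # inspect or retry these separately later
    return failures
```

Draining in bounded batches with a short pause keeps individual database transactions small and gives the downstream service room to recover, which is the property that matters when the backlog built up precisely because those resources were contended.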