Spreedly incident
Core 5xx Errors Impacting Spreedly Core Services
Spreedly experienced a minor incident on February 16, 2025 affecting the Core Transactional API and lasting 56 minutes. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Feb 16, 2025, 09:43 AM UTC
We are investigating an elevated rate of 5xx errors affecting Spreedly core services. This issue began at 3:56 AM EST. Our team is actively working to identify the root cause and mitigate the impact. We will provide further updates as soon as we have more information. If you have any questions or concerns, please reach out and our support team will be happy to assist.
- monitoring Feb 16, 2025, 10:16 AM UTC
As of 4:44 AM EST, our systems have been recovering from this issue, and we are seeing stability return. We have implemented a fix and stabilized Spreedly Core Services. Our team is closely monitoring the situation to ensure no further impact.
- resolved Feb 16, 2025, 10:39 AM UTC
After closely monitoring Spreedly Core Services and confirming that all systems are stabilized and functioning as expected, this incident is considered resolved. No further customer impact is expected. We are completing our investigation concerning the root cause of the incident and any residual impact. A post-mortem will be published. We apologize for any inconvenience or disruption.
- postmortem Feb 20, 2025, 10:16 PM UTC
# February 16, 2025 - Intermittent API Failures Resulting in 404 and 500 Errors

_On February 16, 2025 at 08:55 UTC, Spreedly systems encountered a system resource constraint that led to intermittent API call failures, resulting in_ `404` _and_ `500` _errors being returned to customers. The unresponsive application containers were restarted, allowing the system to return to normal at 09:55 UTC._

### What Happened

On 2025-02-16 between 08:55 and 09:55 UTC, calls to Spreedly's payment method tokenization endpoints intermittently failed, resulting in `HTTP 500` errors. Customers also encountered intermittent elevated rates of `HTTP 404` errors when attempting to use a payment method to transact. These failures were the result of internal system resource limitations.

### Next Steps

* Spreedly has implemented additional resource monitoring in our automated alerting system to prevent future overruns.
* System resources have been adjusted to reduce the likelihood of similar situations in the future.