Elevated Latency Dodgeball Platform
Timeline · 2 updates
- monitoring May 08, 2026, 06:00 PM UTC
A fix has been implemented and we are monitoring results.
- resolved May 08, 2026, 06:05 PM UTC
This incident has been resolved.
Spreedly had 34 outages since May 30, 2024, totaling 56h 28m of downtime, an average of 1.4 incidents per month. Each is summarised below with incident details, duration, and resolution information.
A fix has been implemented and we are monitoring results.
This incident has been resolved.
We are currently investigating performance of the Dodgeball service.
We are working to resolve an issue with delayed processing of requests to the Dodgeball services.
We’ve implemented a fix and stabilized the Dodgeball component that was experiencing a partial outage. We are continuing to monitor.
After closely monitoring and confirming that all systems are stabilized and functioning as expected, this incident is considered resolved. No further customer impact is expected. We apologize for any inconvenience or disruption.
This incident has been resolved.
We are currently investigating an issue affecting customers using Dodgeball. The production database backing Dodgeball has reached critically low available storage, which is expected to result in elevated error rates. Customers may begin experiencing errors within the next 10 minutes as the system’s ability to process requests degrades. Our engineering team is actively investigating and working to restore normal capacity as quickly as possible. We will provide further updates as more information becomes available.
The engineering team has successfully resolved the storage issue affecting the Dodgeball production PostgreSQL instance. Capacity has been restored and the system is now operating normally. At this time, we are not observing any elevated error rates for customers using Dodgeball. We will continue to monitor the system closely to ensure stability. Further updates will be provided if necessary.
The storage issue affecting the Dodgeball production instance has been fully resolved. Capacity has been restored and systems are operating normally. We are no longer observing elevated error rates for customers using Dodgeball. Thank you for your patience while we worked to resolve this issue.
Spreedly has detected an issue that is affecting multiple customers using Payflow Pro where some transactions fail with the error: “Invalid Request. Malicious string found in XML document”. While we are providing an early notification in an effort to alert you as quickly as possible, we are still investigating the actual scope and impact and will provide an update as soon as more details are available. Thank you for your patience.
We are no longer seeing errors at this time, and the issue appears to have subsided. We are working with Payflow Pro to identify the root cause. Our team continues to monitor the situation closely to ensure stability. Please continue to follow our StatusPage for the latest updates.
This incident has been resolved.
Spreedly has detected an issue related to a Dodgeball component experiencing a partial outage. While we are providing an early notification in an effort to alert you as quickly as possible, we are still investigating the actual scope and impact and will provide an update as soon as more details are available. Thank you for your patience.
We’ve implemented a fix and stabilized the Dodgeball component that was experiencing a partial outage. After closely monitoring and confirming that all systems are stabilized and functioning as expected, this incident is considered resolved. No further customer impact is expected. We apologize for any inconvenience or disruption.
We are currently investigating this issue.
We are continuing to investigate this issue.
A fix has been deployed and we are seeing systems recover. We will continue to monitor to ensure stability and will provide further updates as needed.
After a period of monitoring, we have confirmed that the issue has been resolved and systems are operating normally. If you continue to experience any issues, please reach out to Support.
Between 7:00 PM EST on Saturday, December 27th, and 1:00 AM EST on Sunday, December 28th, we identified a period of higher-than-normal latency caused by a pre-planned customer event that resulted in a significant traffic spike. Although we did not observe any service failures or timeouts, our systems experienced degraded service due to the increased latency. Systems recovered quickly once peak traffic subsided and are now operating normally. We expect performance to remain stable moving forward.
The major AWS outage on October 20, 2025, has been largely resolved for most affected services, though some residual issues may persist. Spreedly’s core platform remains stable and fully available. However, due to downstream impact from the AWS outage, some of Spreedly’s indexing endpoints and callback processing are experiencing delays. This is related to dependencies on third-party systems that were affected by the outage and are still undergoing recovery. As a result, you may continue to see intermittent latency or delayed responses for API requests involving these systems. We are actively monitoring the situation as AWS and its downstream services continue recovery efforts. For ongoing updates, please refer to the AWS status page.
After closely monitoring and confirming that all systems are stabilized and functioning as expected, this incident is considered resolved. No further customer impact is expected.
### Postmortem: October 24, 2025 - _AWS outage_

Monday's AWS outage had downstream impact on some 3rd-party systems utilized by Spreedly's platform. Namely, our indexing endpoints and callbacks lagged behind due to the outage and recovery steps.

### What Happened

Spreedly's 3DS global service and legacy smart routing product both faced a period of failures due to 3rd-party systems relying on AWS.

### Next Steps

Spreedly will improve its response to our alerts platform and is actively working on multi-region support, slated for 2026, which should mitigate the impact of similar events going forward.

-The Spreedly Team
We are currently investigating elevated 3DS failures with our Global 3DS provider, Seglan.
At 1:08 PM, our 3DS provider, Seglan, deployed changes to address the issue. After a period of monitoring, we’ve observed normal performance and are declaring this incident resolved.
On August 19th, 2025, from 9:34 a.m. to 12:55 p.m. EST, an unexpected interruption in 3DS execution occurred during a configuration change in one of our 3DS MPI providers' systems. Customers experienced an increased rate of SCA Authentication failures, resulting in the following error: "Gateway transaction not attempted due to failed SCA authentication." Our 3DS provider has since resolved the issue and is implementing stronger safeguards to prevent this from happening again. This issue is considered resolved, and no further customer impact is expected.
This morning, from 11:43 AM to 11:45 AM EDT (about two minutes), we experienced a minor issue that caused intermittent 500 errors. Our team has fixed the problem, and we do not anticipate any further impact on customers.
Spreedly has detected an issue that may impact customers, including potentially resulting in failed transactions. While we are providing an early notification in an effort to alert you as quickly as possible, we are still investigating the actual scope and impact and will provide an update as soon as more details are available.
We are continuing to investigate this issue.
The issue has been identified and a fix is being implemented.
A fix has been implemented and we are monitoring the results.
This incident has been resolved.
# **April 14, 2025 - Certificate Issue Impacts Core Requests**

On April 14 at 3:47 PM ET, a platform-wide incident occurred that caused all core Spreedly requests to fail. The issue was identified and declared at 3:49 PM, mitigated by 4:13 PM, and fully resolved by 4:32 PM. Total impact duration was 26 minutes. During this window, merchants were unable to complete any transactions through the Spreedly platform due to system-wide connectivity failures.

## What Happened

An intermediate certificate on our core service hosts expired on April 14 at 3:47 PM ET, which broke the certificate chain. This prevented HTTP requests between Spreedly internal services and subsequently led to HTTP 500 responses for all requests.

## Next Steps

* Spreedly is completing an audit of all certificate chains to identify potential risks and update our catalog of expiration dates.
* Spreedly has a plan in place to migrate the management of all certificate chains to our cloud provider.
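The expiration catalog mentioned in the next steps above could be checked with something like the following. This is only a minimal sketch, not Spreedly's tooling: the function name `expiring_soon`, the certificate names, the expiry timestamps, and the 30-day warning window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def expiring_soon(catalog, now, warn_days=30):
    """Return (name, whole_days_left) for entries expiring within warn_days.

    Already-expired entries are included with a negative day count.
    Results are sorted soonest-first.
    """
    cutoff = now + timedelta(days=warn_days)
    flagged = [
        (name, (not_after - now).days)
        for name, not_after in catalog.items()
        if not_after <= cutoff
    ]
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical catalog: names and expiry timestamps are illustrative only.
catalog = {
    "leaf.core.example": datetime(2025, 9, 1, tzinfo=timezone.utc),
    "intermediate-ca.example": datetime(2025, 4, 14, 19, 47, tzinfo=timezone.utc),
    "root-ca.example": datetime(2035, 1, 1, tzinfo=timezone.utc),
}

# Fixed "now" so the example is reproducible.
now = datetime(2025, 4, 1, tzinfo=timezone.utc)

for name, days_left in expiring_soon(catalog, now):
    print(f"{name}: {days_left} day(s) until expiry")
```

Listing intermediates and roots alongside leaf certificates matters here, since an expired intermediate breaks the chain even when the leaf is still valid.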
We are currently investigating an issue where the stylesheet express-mobile-3.1.0.min.css is returning a 404 Not Found error, causing the user interface to break in the Spreedly Express integration.
The issue with the missing stylesheet express-mobile-3.1.0.min.css causing the user interface to break in the Spreedly Express integration has been resolved. We are now actively monitoring the system to ensure the issue does not recur.
The issue has been fully resolved, and the system is functioning as expected. No further incidents have been observed.
We are investigating an elevated rate of 5xx errors affecting Spreedly core services starting at 1:36 AM EST. While we are providing an early notification in an effort to alert you as quickly as possible, we are still investigating the actual scope and impact. Our team is actively working to identify the root cause and mitigate the impact. We will provide further updates as soon as we have more information. Thank you for your patience.
As of 3:10 AM EST, our system availability has been fully restored and we are seeing stability return. We have implemented a fix and stabilized Spreedly Core Services. Our team is closely monitoring the situation to ensure no further impact.
IMPACT STARTED AT: 2:12 AM EST
IMPACT ENDED AT: 2:40 AM EST
After closely monitoring Spreedly Core Services and confirming that all systems are stabilized and functioning as expected, this incident is considered resolved. No further customer impact is expected. We are completing our investigation concerning the causes of the incident and any residual impact. A post-mortem will be published. We apologize for any inconvenience or disruption.
# March 15th, 2025 - Payment Method Storage Errors

_On March 15th, 2025 at 06:12 UTC, Spreedly systems encountered a system resource constraint which led to intermittent API call failures resulting in_ `404` _and_ `500` _errors being returned to customers. The unresponsive application containers were restarted, allowing the system to return to normal at 06:47 UTC._

## What Happened

On 2025-03-15 between 06:12 and 06:47 UTC, calls to Spreedly’s payment method tokenization endpoints intermittently failed, resulting in `HTTP 500` errors. Intermittent elevated rates of `HTTP 404` errors when attempting to use a payment method to transact were also encountered. These failures were the result of internal system resource limitations.

## Next Steps

* Spreedly has implemented additional resource monitoring in our automated alerting system to prevent future overruns.
* System resources have been adjusted to prevent similar situations in the future.
We are investigating an elevated rate of 5xx errors affecting Spreedly core services. This issue began at 3:56 AM EST. Our team is actively working to identify the root cause and mitigate the impact. We will provide further updates as soon as we have more information. If you have any questions or concerns, please reach out and our support team will be happy to assist.
As of 4:44 AM EST, our systems have been recovering from this issue, and we are seeing stability return. We have implemented a fix and stabilized Spreedly Core Services. Our team is closely monitoring the situation to ensure no further impact.
After closely monitoring Spreedly Core Services and confirming that all systems are stabilized and functioning as expected, this incident is considered resolved. No further customer impact is expected. We are completing our investigation concerning the root cause of the incident and any residual impact. A post-mortem will be published. We apologize for any inconvenience or disruption.
# February 16, 2025 - Intermittent API Failures Resulting in 404 and 500 errors

_On February 16, 2025 at 08:55 UTC, Spreedly systems encountered a system resource constraint which led to intermittent API call failures resulting in_ `404` _and_ `500` _errors being returned to customers. The unresponsive application containers were restarted, allowing the system to return to normal at 09:55 UTC._

### What Happened

On 2025-02-16 between 08:55 and 09:55 UTC, calls to Spreedly’s payment method tokenization endpoints intermittently failed, resulting in `HTTP 500` errors. Intermittent elevated rates of `HTTP 404` errors when attempting to use a payment method to transact were also encountered. These failures were the result of internal system resource limitations.

### Next Steps

* Spreedly has implemented additional resource monitoring in our automated alerting system to prevent future overruns.
* System resources have been adjusted to prevent similar situations in the future.
Spreedly has discovered an issue that could potentially impact new payment method creation. We are taking all necessary measures to investigate the matter and determine the extent of the situation. We will provide an update as soon as more details are available.
From 12:47 to 12:58 PM EST, we encountered an issue that affected customers attempting to tokenize a payment method or transact with a payment method, resulting in a 500 error. We have identified the problem and implemented a fix. Our team is closely monitoring the situation to ensure there are no further impacts. If you need additional assistance, please reach out to our Support team.
After closely monitoring and confirming that all systems are stabilized and functioning as expected, this incident is considered resolved. No further customer impact is expected. We apologize for any inconvenience or disruption.
# December 12th, 2024 - Payment Method Tokenization Errors

Server maintenance to reduce log volume caused an internal service to become unavailable, causing payment method tokenization requests to fail.

## What Happened

At approximately 17:45 UTC on December 12th, 2024, a logging configuration change made to reduce log volume resulted in an internal service becoming unavailable. Traffic was shifted to an alternate service during the outage window to reduce impact. At approximately 18:01 UTC, the internal service resumed normal operation and service was restored. Payment transactions were negatively impacted during this time.

## Next Steps

Spreedly Engineering is investigating the coupling of the configuration change to the internal service and will be taking steps to prevent the issue from occurring in the future.
Spreedly is investigating an issue that could potentially impact our services, leading to 404 errors on the Global Collect gateway. We are taking all necessary measures to investigate the matter and determine the extent of the situation. We will provide an update as soon as more details are available. Thank you for your patience.
We have identified an issue with a code change that was deployed around 3:35 pm EST that affected the Global Collect gateway. We have rolled back this code change and are now seeing successful transactions. After a period of monitoring, we have not seen any recurrence of this issue. The incident is being considered resolved. We will publish a postmortem on our status page as soon as possible. We apologize for any inconvenience this has caused.
# December 11th, 2024 - Global Collect Gateway Transaction Failures

A change to the Global Collect gateway integration resulted in transaction failures for a subset of customers.

## What Happened

At 20:50 UTC, an update was made to the Global Collect gateway integration. Initial monitoring and observability did not surface issues. At 23:53 UTC, Spreedly Engineering began researching an elevated number of transaction failures on the Global Collect gateway. After investigation, it was determined that the prior update to the Global Collect gateway was causing the increase in transaction failures. At 01:02 UTC on December 12th, 2024, the change was reverted and service to the Global Collect gateway was restored.

## Next Steps

Spreedly Engineering is reviewing and improving internal observability to better surface gateway-specific transaction failure rate anomalies.
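As a rough illustration of what surfacing gateway-specific failure rate anomalies could look like, the sketch below compares each gateway's failure rate in the current window against its own baseline. It is not Spreedly's implementation: the function name `flag_gateway_anomalies`, the gateway keys, the counts, the baseline rates, the 3x multiplier, and the minimum-traffic cutoff are all hypothetical.

```python
def flag_gateway_anomalies(window, baseline_rates, factor=3.0, min_attempts=50):
    """Return {gateway: failure_rate} for gateways well above their baseline.

    window:         {gateway: (failures, attempts)} for the current interval
    baseline_rates: {gateway: typical failure rate between 0 and 1}
    """
    flagged = {}
    for gateway, (failures, attempts) in window.items():
        if attempts < max(min_attempts, 1):
            continue  # too little traffic to judge (also avoids dividing by zero)
        baseline = baseline_rates.get(gateway)
        if baseline is None:
            continue  # no history for this gateway yet
        rate = failures / attempts
        if rate > factor * baseline:
            flagged[gateway] = rate
    return flagged


# Hypothetical counts for one monitoring window.
window = {
    "global_collect": (40, 200),  # 20% failures
    "other_gateway": (4, 200),    # 2% failures
    "quiet_gateway": (5, 10),     # skipped: below min_attempts
}
baseline_rates = {
    "global_collect": 0.02,
    "other_gateway": 0.02,
    "quiet_gateway": 0.02,
}

for gateway, rate in flag_gateway_anomalies(window, baseline_rates).items():
    print(f"{gateway}: failure rate {rate:.1%} is above its baseline")
```

Tracking rates per gateway rather than platform-wide is the point of the sketch: a regression that affects one integration can be invisible in aggregate metrics when that gateway carries a small share of total traffic.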