Togetherwork incident
[SEV-1] Togetherpay - Intermittent Processing Issues
Togetherwork experienced a major incident on October 14, 2024 affecting Transaction Processing, Payment Tokenization, and 1 more component. The incident lasted 40 minutes and has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Oct 14, 2024, 07:00 PM UTC
ProPay is aware of an issue that is preventing some, and possibly all, transactions from being processed. They are actively working to resolve it. We will share additional information as soon as it is available.
- monitoring Oct 14, 2024, 07:08 PM UTC
ProPay implemented a fix and is monitoring the results.
- monitoring Oct 14, 2024, 07:17 PM UTC
We continue to see transaction timeouts. ProPay's fix does not appear to have reduced or resolved the issue. We continue to monitor the situation.
- monitoring Oct 14, 2024, 07:27 PM UTC
ProPay continues to apply fixes and monitor results. We are seeing a significant improvement in processing. All teams are still monitoring the situation.
- resolved Oct 14, 2024, 07:40 PM UTC
ProPay implemented their fix, monitored the results, and resolved the incident. We observed significant timeouts between 2:26 p.m. and 3:18 p.m. Eastern. Processing is back to normal. This incident is resolved.
- postmortem Oct 28, 2024, 07:43 PM UTC
Regarding the incident on Monday, October 14 between 2:25 p.m. and 3:25 p.m. Eastern: ProPay identified several servers in an unhealthy state after maintenance was performed to address the increased size of the replication distribution database. This maintenance caused a disconnect between the application layer and the database layer, resulting in transaction timeouts. Resolution: Support teams performed a failover to the secondary server, which restored service. Additional review is being performed to improve monitoring, alerting, and response plans for similar situations in the future.
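The impact windows above are given in Eastern time, while the update timeline uses UTC. The following is a minimal sketch for lining the two up. It assumes that "eastern" on October 14, 2024 means Eastern Daylight Time (UTC-4); the helper names `eastern_to_utc` and `format_window` are illustrative and not part of any ProPay or Togetherwork tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "eastern" on Oct 14, 2024 is Eastern Daylight Time (UTC-4).
EDT = timezone(timedelta(hours=-4), "EDT")


def eastern_to_utc(hour, minute):
    """Convert a wall-clock time on Oct 14, 2024 (assumed EDT) to UTC."""
    local = datetime(2024, 10, 14, hour, minute, tzinfo=EDT)
    return local.astimezone(timezone.utc)


def format_window(start, end):
    """Render a UTC window with its length in whole minutes."""
    minutes = int((end - start).total_seconds() // 60)
    return f"{start:%H:%M}-{end:%H:%M} UTC ({minutes} min)"


# Impact windows as reported in the resolution note and the postmortem.
windows = {
    "resolution note": (eastern_to_utc(14, 26), eastern_to_utc(15, 18)),
    "postmortem": (eastern_to_utc(14, 25), eastern_to_utc(15, 25)),
}

for label, (start, end) in windows.items():
    print(f"{label}: {format_window(start, end)}")
```

Under that assumption, both reported windows begin before the first status update at 07:00 PM UTC.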