Redox incident

Increased message traffic latency

Minor, Resolved

Redox experienced a minor incident on September 29, 2025 affecting Traffic Processing and Logs (view/search), lasting 4h 23m. The incident has been resolved; the full update timeline is below.

Started: Sep 29, 2025, 04:11 PM UTC
Resolved: Sep 29, 2025, 08:35 PM UTC
Duration: 4h 23m
Detected by Pingoru: Sep 29, 2025, 04:11 PM UTC

Affected components

Traffic Processing, Logs (view/search)

Update timeline

  1. investigating Sep 29, 2025, 04:11 PM UTC

    At approximately 10:39am CT, Redox became aware of an issue in which logs are delayed. What this means: you may experience a delay before receiving traffic from Redox. However, please be aware that the actual processing of logs is unaffected; message flow is still occurring, albeit at a slower pace. We are actively looking into this issue and will have a fix available as soon as possible. If you are concerned that your logs have been impacted or have any additional questions, please reach out to [email protected]

  2. investigating Sep 29, 2025, 04:17 PM UTC

    Please note that log visibility may also be delayed. We are continuing to investigate this issue.

  3. identified Sep 29, 2025, 04:40 PM UTC

    We have identified the cause for the log delays and are actively working on a fix. Traffic processing latency is mostly caught up, but visibility may still be delayed. We will continue to update as we have more information.

  4. monitoring Sep 29, 2025, 05:35 PM UTC

    A fix has been implemented for the processing delay and visibility issues, and we are actively monitoring our logs. Processing and visibility should be back to normal. If you have questions, please reach out to [email protected]

  5. resolved Sep 29, 2025, 08:35 PM UTC

    The incident has been resolved. If you have any additional questions, please reach out to [email protected]

  6. postmortem Oct 23, 2025, 12:43 AM UTC

    ## **Summary**

    At approximately 10:39 AM CT on September 29, 2025, we observed system-wide degradation creating latency for transaction processing and observability. Latency was resolved for most customers by about 1:30 PM CT on September 29, 2025. At 4:43 AM CT on September 30, 2025, we observed a similar system degradation. Latency was resolved for customers by 1:23 PM CT on September 30, 2025.

    ## **What Happened**

    * A confluence of several issues generated a number of errors in our system. These factors included a change in the way we manage database partition rotations, an upgrade of a core library, and the interaction of some services with Kubernetes.
    * These errors caused some services to crash. Although our infrastructure is fault-tolerant and can recover from such crashes, the degree of crashing put pressure on our infrastructure in a way that manifested as latency in transaction processing and a delay in transaction observability.

    ## **What we are doing about this:**

    * We rolled back the upgrade made to the affected third-party library.
    * We are adding and improving alerting on affected services to provide clearer visibility to the teams that are responsible for those services.