Redox incident

VPN Degradation for subset of VPN tunnels

Redox experienced a major incident on February 23, 2026 affecting Traffic Processing, lasting 43m. The incident has been resolved; the full update timeline is below.

Started: Feb 23, 2026, 08:31 PM UTC
Resolved: Feb 23, 2026, 09:14 PM UTC
Duration: 43m
Detected by Pingoru: Feb 23, 2026, 08:31 PM UTC

Affected components

Traffic Processing

Update timeline

identified Feb 23, 2026, 08:31 PM UTC

We have identified the issue with our VPN gateways and are currently working on a fix.
identified Feb 23, 2026, 08:38 PM UTC

We have identified the issue with our VPN gateways and are currently working on a fix. A subset of HL7 over VPN traffic will be impacted.
monitoring Feb 23, 2026, 08:55 PM UTC

Our gateways have been restarted and VPN tunnels are coming back up. We are currently monitoring results.
monitoring Feb 23, 2026, 08:59 PM UTC

Starting 02/23/2026 at ~2:15 PM Central Time, the Redox MLLP Listener service experienced degraded performance when processing inbound MLLP transactions. Inbound and outbound HL7 messages to and from Redox were delayed while we investigated the root cause. Web requests (HTTPS) from Redox were unaffected. We have mitigated the issue and VPN tunnels are back online. If you have any additional questions, please notify us at [email protected].
resolved Feb 23, 2026, 09:14 PM UTC

This incident has been resolved.
postmortem Mar 02, 2026, 03:28 PM UTC

### **Summary**

On **February 23, 2026**, from **1:05 AM to 2:05 AM CST** and from **1:53 PM to 3:03 PM CST**, a subset of customers experienced elevated latency and connectivity issues across a number of our MLLP VPN Gateways. Service was fully restored once primary gateway instances were cycled and traffic successfully failed over to secondary nodes.

### **What Happened**

The outage was triggered by a memory exhaustion event on specific VPN Gateway instances. A memory leak in a third-party library led to a steady rise in memory consumption, eventually causing the primary instances to become unresponsive.

### **What We Are Doing**

To prevent a recurrence and improve our response time, our engineering team is implementing the following measures:

* **Stability:** We are deploying updates that mitigate the third-party memory leak, to prevent resource exhaustion and unresponsive primary gateway instances. We submitted a ticket to the library's maintainers, who have since resolved the issue in the library.
* **Enhanced Monitoring:** We are deploying more granular alerts that monitor MLLP gateway traffic health and reduce time to detection for unhealthy gateways.
* **Service Resilience:** We are reviewing our automated health monitoring and failover protocols to ensure the system responds effectively to instance health changes.
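As an illustration only, the kind of early warning described under Enhanced Monitoring could be sketched as a trend check on memory usage: fit a line to recent samples and alert when the projected time to exhaustion falls inside a chosen horizon. This is not Redox's implementation. The `MemorySample` type, the function names, the 95% limit, and the 60-minute horizon are all hypothetical choices made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class MemorySample:
    """One hypothetical reading from a gateway instance."""
    minutes: float       # time since the first sample, in minutes
    used_percent: float  # memory in use, 0-100


def growth_per_minute(samples: Sequence[MemorySample]) -> float:
    """Least-squares slope of memory usage, in percentage points per minute."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(s.minutes for s in samples) / n
    mean_m = sum(s.used_percent for s in samples) / n
    denom = sum((s.minutes - mean_t) ** 2 for s in samples)
    if denom == 0:
        return 0.0
    num = sum((s.minutes - mean_t) * (s.used_percent - mean_m) for s in samples)
    return num / denom


def minutes_until_exhaustion(
    samples: Sequence[MemorySample], limit_percent: float = 95.0
) -> Optional[float]:
    """Projected minutes until usage reaches the limit, or None if not growing."""
    slope = growth_per_minute(samples)
    if not samples or slope <= 0:
        return None
    remaining = max(limit_percent - samples[-1].used_percent, 0.0)
    return remaining / slope


def should_alert(
    samples: Sequence[MemorySample],
    limit_percent: float = 95.0,    # assumed threshold, not from the postmortem
    horizon_minutes: float = 60.0,  # assumed lead time, not from the postmortem
) -> bool:
    """True when the trend projects exhaustion within the horizon."""
    eta = minutes_until_exhaustion(samples, limit_percent)
    return eta is not None and eta <= horizon_minutes


if __name__ == "__main__":
    leaking = [MemorySample(t, 50.0 + t) for t in (0, 10, 20, 30)]
    print(should_alert(leaking))
```

A trend-based check like this can flag a slow leak well before an instance becomes unresponsive, whereas a fixed usage threshold only fires once memory is already nearly exhausted.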