Thycotic incident

Secret Server Cloud: US - RDP/SSH Connectivity Issues

Minor · Resolved

Thycotic experienced a minor incident on January 23, 2026, affecting Secret Server Cloud, lasting 1h 9m. The incident has been resolved; the full update timeline is below.

Started
Jan 23, 2026, 11:38 PM UTC
Resolved
Jan 24, 2026, 12:47 AM UTC
Duration
1h 9m
Detected by Pingoru
Jan 23, 2026, 11:38 PM UTC

Affected components

Secret Server Cloud

Update timeline

  1. identified Jan 23, 2026, 11:38 PM UTC

    We are currently investigating an issue impacting RDP and SSH connections initiated through Secret Server Cloud, specifically when connections are made via the Connection Manager / RDP proxy. Some customers may experience:

    * Failures when launching RDP or SSH sessions
    * Black screens during connection attempts
    * Inability to connect through the RDP proxy, while direct RDP may continue to work

    We have identified the underlying cause and are actively implementing a fix. Additional updates will be shared as progress is made.

  2. identified Jan 23, 2026, 11:55 PM UTC

    We have confirmed the root cause of this issue to be resource contention impacting the RDP/SSH proxy services. To mitigate the issue, we have added additional resources to increase capacity and reduce connection failures. We are continuing to monitor the environment closely to ensure stability and will provide further updates as needed.

  3. monitoring Jan 24, 2026, 12:07 AM UTC

    A fix has been implemented to address the resource contention impacting RDP and SSH connections. We are actively monitoring the service, and results indicate that RDP and SSH connections are functioning normally without disconnections.

  4. resolved Jan 24, 2026, 12:47 AM UTC

    This incident has been resolved.

  5. postmortem Jan 27, 2026, 04:39 PM UTC

    ## Incident Overview

    Customers in the US region experienced intermittent failures when initiating RDP and SSH sessions through Secret Server Cloud, specifically via the Connection Manager / RDP proxy.

    Start Time: January 23, 2026, 5:45 PM ET
    End Time: January 23, 2026, 6:45 PM ET

    Some customers observed:

    * Failures when launching RDP or SSH sessions
    * Black screens during connection attempts
    * Inability to connect via the RDP proxy, while direct RDP access may have continued to work

    Other regions were not impacted.

    ## Detection

    The issue was initially identified through customer support tickets. Internal alerts were received later because alert thresholds were set too high to detect the early stages of the degradation.

    ## Root Cause

    As part of a [planned maintenance](https://status.delinea.com/incidents/r8wkgt5ghcqx) activity in the US region on January 23, customer-facing Azure Service Bus resources were migrated to new hostnames, along with updates to email services and domains. The maintenance completed at 4:30 PM ET.

    To avoid data loss and disruption to active customer sessions during the maintenance:

    * A replica set of compute nodes continued to point to the legacy Service Bus
    * A new set of compute nodes was provisioned to use the new Service Bus hostnames

    This temporary configuration required approximately double the normal compute capacity, which is not a typical operating condition. As a result:

    * Some compute nodes remained in a provisioning state longer than expected due to resource constraints in the node pool
    * Over the following hour, there was insufficient available capacity to process incoming connection requests
    * This led to RDP and SSH session failures and disconnections via the Connection Manager

    Once the issue was identified, capacity was increased by:

    * Expanding the compute node pool
    * Rebalancing resources away from the replica nodes pointing to the legacy Service Bus

    These actions allowed the backlog of requests to be processed and restored normal service behavior. This behavior was not observed in lower silos or in other regions during earlier phases of the maintenance period.

    ## Preventive Actions

    To prevent recurrence and improve detection during atypical load conditions, the following actions are being implemented (a sketch of the kind of pod-health check described in the last item follows this timeline):

    * Lower alert thresholds for message queue depth and processing delays to enable faster detection and response
    * Reduce alert lookback periods to surface emerging issues sooner during rapid load changes
    * Introduce a new monitor tracking the percentage of compute pods in an unhealthy state due to resource constraints in the node pool
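
For illustration only: the postmortem does not say how the new monitor is implemented. A minimal sketch of that kind of check, assuming a Kubernetes-style compute pool and the official Kubernetes Python client, could look like the following. The namespace, label selector, and alert threshold are hypothetical, not taken from the vendor.

```python
# Hypothetical sketch: report the percentage of compute pods that are unhealthy,
# e.g. stuck Pending because the node pool has no available capacity.
# Namespace, label selector, and threshold below are illustrative assumptions.
from kubernetes import client, config


def unhealthy_pod_percentage(namespace="rdp-proxy", selector="app=session-proxy"):
    config.load_kube_config()  # use config.load_incluster_config() when running inside the cluster
    pods = client.CoreV1Api().list_namespaced_pod(namespace, label_selector=selector).items
    if not pods:
        return 0.0

    def is_unhealthy(pod):
        # Pods still Pending are typically waiting on node-pool capacity (unschedulable).
        if pod.status.phase == "Pending":
            return True
        # Running pods whose containers are not ready also count as unhealthy.
        statuses = pod.status.container_statuses or []
        return any(not cs.ready for cs in statuses)

    return 100.0 * sum(is_unhealthy(p) for p in pods) / len(pods)


if __name__ == "__main__":
    pct = unhealthy_pod_percentage()
    print(f"unhealthy pods: {pct:.1f}%")
    if pct > 20.0:  # illustrative alert threshold
        print("ALERT: node pool may be resource constrained")
```

A check like this, run on a short interval and wired into alerting, would surface capacity starvation in the node pool faster than the queue-depth thresholds that detected this incident late.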