UiPath incident
Canada - Multiple Services - Degraded Performance
Update timeline
- investigating Mar 12, 2026, 04:39 PM UTC
We are investigating reports of degraded performance impacting Orchestrator in Canada. Our teams are working to identify the cause and will share more details as the investigation progresses.
- investigating Mar 12, 2026, 05:33 PM UTC
Our teams are continuing to investigate the degraded performance impacting Orchestrator in Canada. The investigation is ongoing, and we are working to identify the root cause. We will share additional updates as more information becomes available.
- investigating Mar 12, 2026, 05:55 PM UTC
Our teams are continuing to investigate the degraded performance impacting services in Canada. The scope has expanded from Orchestrator to all services in the Canada region. The investigation is ongoing, and we are working to identify the root cause. We will share additional updates as more information becomes available.
- monitoring Mar 12, 2026, 06:11 PM UTC
Our teams have identified the issue causing degraded performance impacting services in the Canada region and have implemented a fix. We are currently monitoring the services to ensure stability and will share additional updates if needed.
- monitoring Mar 12, 2026, 06:18 PM UTC
Our teams have identified the issue causing degraded performance impacting customers accessing UiPath from the Canada region and have implemented a fix. We are currently monitoring the services to ensure stability and will share additional updates if needed.
- resolved Mar 12, 2026, 06:28 PM UTC
Our teams have identified and resolved the issue that was causing degraded performance impacting customers accessing UiPath from the Canada region. A fix has been implemented, and services have recovered. We will continue monitoring to ensure ongoing stability.
- postmortem Mar 16, 2026, 07:06 AM UTC
## Customer Impact

Customers in the **Canada region** experienced intermittent slowness and timeouts when accessing UiPath Automation Cloud services, including Orchestrator, Action Center, and asset retrieval, over a period of approximately three days. UiPath was in the process of rolling out new gateway infrastructure across regions in a phased manner, and the Canada region was one of the regions receiving this rollout. Because the rollout was partial, with not all regions migrated simultaneously, customers whose traffic was routed through the Canada Central gateway were affected. Services in other regions continued to operate normally.

* **Impact start**: ~March 9, 2026 16:00 UTC
* **Most severe window**: March 11, 08:00–12:00 UTC. Request latency at the 99th percentile reached ~5 seconds (versus a normal baseline of ~300 ms), with over 100,000 requests exceeding five seconds in a 4-hour window.
* **Impact end**: March 12, 17:45 UTC
* **Total duration**: ~73 hours (intermittent), with the most acute customer-facing degradation on March 11

During this period, affected customers experienced slow page loads (particularly in Action Center and Orchestrator), asset retrieval timeouts, and sporadic HTTP 503 errors. The error rate remained low (peaking at 0.2% of requests), but the latency degradation caused client-side timeouts that manifested as failures for end users.

## Root Cause

Customer traffic for the Canada region passes through a regional gateway that directs each request to the correct backend service. The phased rollout of the new gateway infrastructure began in late February with 5% of traffic routed through the new infrastructure, reached 50% on March 5, and moved to 100% on March 9. Monitoring during the ramp-up period showed no degradation or failures. When full traffic hit the new gateway, however, latency increased, and the increase went unnoticed. The added latency caused some configured thresholds to be exceeded, which compounded the issue by queuing requests and triggering restarts of routing services.

Restarted routing components came back with fresh, empty caches. Normally, these caches allow requests to be resolved in milliseconds. Without them, each request had to fetch routing data from slower backend sources, a process that, under the additional load of the rollout, took longer than the gateway's 5-second timeout. This created a repeating cycle: new instances started up, could not respond fast enough, were flagged as unhealthy, and were restarted, only to start again with empty caches. The cycle was made worse by the Canada Central gateway running at its minimum allocated capacity, leaving insufficient room for old and new instances to run side by side during the rollout. The net effect was that a subset of customer requests experienced delays of up to 15 seconds (due to automatic retries), and a small number failed outright with errors.

## Detection

This issue was detected via **customer reports** on March 12 at 16:54 UTC (8:54 am PST). The ~3-day gap between impact start (March 9) and detection (March 12) was due to two factors:

* **The issue was latency, not outright failure.** Most requests eventually succeeded after retries, so error-rate-based monitors did not trigger.
* **Synthetic monitoring for the Canada region did not detect the degradation.** The synthetic checks had timeouts long enough to absorb the 15-second delays, so they continued to report as passing. A latency-aware version of such a check is sketched after this list.
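The detection gap comes down to pass/fail probes with generous timeouts: a check that tolerates a 30-second response will happily report success at 15 seconds. As a minimal sketch of what a latency-threshold synthetic check could look like, the snippet below times each probe and alerts on slowness rather than only on failure. The probe URL, the 1-second threshold, and the `notify()` hook are illustrative assumptions, not UiPath's actual tooling.

```python
import time
import requests  # third-party HTTP client; any equivalent works

# Hypothetical values: the probe URL and threshold are assumptions,
# not UiPath's configuration. Baseline p99 latency was ~300 ms.
PROBE_URL = "https://status-probe.example.com/ca/orchestrator/health"
LATENCY_THRESHOLD_S = 1.0
PROBE_TIMEOUT_S = 30  # deliberately generous: measure slowness, don't mask it

def notify(message: str) -> None:
    # Stand-in for a real paging/alerting integration.
    print(f"[ALERT] {message}")

def run_latency_probe() -> None:
    start = time.monotonic()
    try:
        response = requests.get(PROBE_URL, timeout=PROBE_TIMEOUT_S)
        response.raise_for_status()
    except requests.RequestException as exc:
        notify(f"probe failed outright: {exc}")
        return
    elapsed = time.monotonic() - start
    # A pass/fail check with a 30 s timeout would report success here even
    # at 15 s; alerting on elapsed time is what closes the detection gap.
    if elapsed > LATENCY_THRESHOLD_S:
        notify(f"latency {elapsed:.2f}s exceeds {LATENCY_THRESHOLD_S:.1f}s threshold")

if __name__ == "__main__":
    run_latency_probe()
```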
## Response

Our engineering team began working on this issue as soon as it was detected through customer reports on March 12. Within 51 minutes, we resolved the incident by switching traffic back to the previous infrastructure. Services returned to normal operation immediately, and we confirmed that customers could access all services without issues.

## Follow-Up

1. **Add a cache connectivity check to the routing service startup.** New instances will verify cache readiness before accepting traffic, preventing the cold-cache cycle from occurring (a minimal sketch follows this list).
2. **Increase the minimum compute capacity** for the affected gateway cluster to ensure sufficient headroom during infrastructure rollouts.
3. **Add latency-based synthetic monitoring for the Canada region.** Current checks only detect total failures; a latency threshold alert would have caught this within the first hour.
4. **Audit all regional gateway clusters** for the same capacity risk to prevent recurrence in other regions.
5. **Add anomaly-detection-based alerts for service restarts and replica set creation.** Although restarts are expected during normal operation, multiple such events in a short timeframe are anomalous and should be investigated immediately.
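To make item 1 concrete, here is a minimal sketch of a startup gate that blocks an instance from reporting ready until its routing cache is reachable. The cache host, port, and deadline are hypothetical, and a production gate would likely also warm the cache or issue a test read rather than checking connectivity alone.

```python
import socket
import sys
import time

# Illustrative values only: host, port, and retry budget are assumptions.
CACHE_HOST, CACHE_PORT = "routing-cache.internal", 6379
READINESS_DEADLINE_S = 60
RETRY_INTERVAL_S = 2

def cache_is_reachable() -> bool:
    """Cheap TCP-level check; a real gate would also issue a test read."""
    try:
        with socket.create_connection((CACHE_HOST, CACHE_PORT), timeout=2):
            return True
    except OSError:
        return False

def wait_for_cache() -> None:
    deadline = time.monotonic() + READINESS_DEADLINE_S
    while time.monotonic() < deadline:
        if cache_is_reachable():
            return  # safe to start accepting traffic
        time.sleep(RETRY_INTERVAL_S)
    # Refusing to report ready is what breaks the cold-cache restart cycle:
    # the instance never takes traffic it cannot serve within the timeout.
    sys.exit("cache not reachable before deadline; staying unready")

if __name__ == "__main__":
    wait_for_cache()
    print("cache reachable; instance may report ready")
```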
Looking to track UiPath downtime and outages?
Pingoru polls UiPath's status page every 5 minutes and alerts you the moment it reports an issue — before your customers do.
- Real-time alerts when UiPath reports an incident
- Email, Slack, Discord, Microsoft Teams, and webhook notifications
- Track UiPath alongside 5,000+ providers in one dashboard
- Component-level filtering
- Notification groups + maintenance calendar
5 free monitors · No credit card required