Okta incident

Workflows issue

Severity: Major · Status: Resolved

Okta experienced a major incident on August 8, 2024 affecting okta.com cells 1, 2, 3, 4, 6, 7, and 11, as well as Workflows. The incident has been resolved; the full update timeline is below. Note that the 396d 19h duration reflects when the tracker marked the entry resolved; the postmortem below reports a total impact of 57 minutes.

Started
Aug 08, 2024, 10:17 PM UTC
Resolved
Sep 09, 2025, 05:44 PM UTC
Duration
396d 19h
Detected by Pingoru
Aug 08, 2024, 10:17 PM UTC

Affected components

okta.com cell 1, okta.com cell 2, okta.com cell 3, okta.com cell 4, okta.com cell 6, okta.com cell 7, okta.com cell 11, Workflows

Update timeline

  1. resolved Aug 08, 2024, 10:17 PM UTC

    Our Workflows team is investigating an issue impacting Workflows execution for a subset of Okta cells (US-Cell 1, 2, 3, 4, 6, 7, & 11). During this time, Okta Admins may experience slowness or time-outs when running flows. Our Workflows team is performing corrective actions to mitigate this degradation. We'll provide an update in 30 minutes or sooner if additional information becomes available.

    3:29pm PDT: Our Workflows team continues to address the timeouts in Workflows executions for a subset of Okta US Cells with top priority. The team has identified a potential root cause and is adjusting the configuration to mitigate the issue in the FL1 service. We'll provide an update in 30 minutes or sooner if additional information becomes available.

    3:32pm PDT: The issue impacting FL1 has been addressed. Okta's monitoring shows recovery to normal conditions. Additional root cause information will be available within 5 business days.

  2. resolved Aug 15, 2024, 10:58 PM UTC

    We sincerely apologize for any impact this incident has caused to you, your business, and your customers. At Okta, trust and transparency are our top priorities. Outlined below are the facts regarding this incident. We are committed to implementing improvements to the service to prevent future occurrences of this incident.

    Detection and Impact
    On August 8th at 2:29 pm PT, Okta detected interruptions for customers using Workflows in US FL1. Impacted customers experienced timeout errors when attempting flow executions or any operations in the workflow designer.

    Root Cause Summary
    The root cause of this incident was an internal service issuing a high volume of concurrent requests against our cache database, overwhelming it during a period of high activity. This caused degraded performance in the Workflows system and resulted in the timeout errors customers experienced.

    Remediation Steps
    Upon receiving alerts, Okta began diagnosing the issue. From 3:10 pm to 3:16 pm PT, Okta observed that the service began to recover on its own, and the team continued to monitor its progress. At 3:26 pm PT, processing returned to normal rates.

    Preventative Actions
    Okta is making performance optimizations in this environment to address the recent issue and ensure stability. We are also pursuing longer-term architecture improvements to remove this cache.

    Total Duration: 57 minutes (August 8th, 2:29 pm - 3:26 pm PT)
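The failure mode described in the root cause — an unbounded burst of concurrent requests overwhelming a backing cache — is commonly mitigated by capping the number of in-flight requests at the client. The sketch below is a minimal illustration of that general pattern using a Python asyncio semaphore; it is not Okta's actual implementation, and the names `BoundedCacheClient` and `MAX_CONCURRENT` are hypothetical.

```python
import asyncio

MAX_CONCURRENT = 10  # illustrative cap, not Okta's real configuration


class BoundedCacheClient:
    """Wraps a cache call with a semaphore so a burst of callers
    cannot exceed a fixed number of simultaneous requests."""

    def __init__(self, limit: int) -> None:
        self._sem = asyncio.Semaphore(limit)
        self._in_flight = 0
        self.peak_in_flight = 0  # recorded for demonstration

    async def get(self, key: int) -> str:
        # Callers beyond the limit wait here instead of piling
        # onto the cache backend.
        async with self._sem:
            self._in_flight += 1
            self.peak_in_flight = max(self.peak_in_flight, self._in_flight)
            await asyncio.sleep(0)  # stand-in for the real cache round-trip
            self._in_flight -= 1
            return f"value:{key}"


async def main() -> tuple[int, list[str]]:
    client = BoundedCacheClient(MAX_CONCURRENT)
    # 200 callers arrive at once, but at most 10 are in flight.
    results = await asyncio.gather(*(client.get(i) for i in range(200)))
    return client.peak_in_flight, results


peak, results = asyncio.run(main())
```

Even with 200 simultaneous callers, `peak_in_flight` never exceeds the cap, so the backing store sees a bounded load rather than the full burst.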