Cloudera Outage History
Cloudera is up right now. Cloudera had 19 outages in the last 2 years totaling 426h 36m of downtime, averaging 0.8 incidents per month.
Each of the 19 outages since June 5, 2024 is summarised below, with incident details, duration, and resolution information.
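The headline figures (19 outages, 426h 36m of downtime, 0.8 incidents per month) can be cross-checked with a little arithmetic. The following is a minimal Python sketch, assuming that "the last 2 years" means a 24-month window; the helper names `parse_downtime` and `summarise` are illustrative and not part of any Cloudera tooling.

```python
import re


def parse_downtime(text: str) -> int:
    """Convert a duration string such as '426h 36m' into total minutes."""
    match = re.fullmatch(r"(\d+)h\s+(\d+)m", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes = (int(group) for group in match.groups())
    return hours * 60 + minutes


def summarise(outages: int, downtime: str, months: int = 24) -> dict:
    """Derive per-month and per-outage averages from the headline numbers."""
    total_minutes = parse_downtime(downtime)
    return {
        "incidents_per_month": round(outages / months, 1),
        "total_minutes": total_minutes,
        "mean_minutes_per_outage": round(total_minutes / outages),
    }


# Figures taken from the summary; the 24-month window is an assumption.
stats = summarise(19, "426h 36m")
print(stats)
```

Under that assumption the 0.8 incidents-per-month figure is reproduced, and the mean works out to roughly 22 hours of recorded downtime per outage. That mean is skewed by a few multi-day incidents, since several of the outages listed below lasted under an hour.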
Intermittent Management Console Access Issues Across US, EU, and AP Regions
Timeline · 2 updates
FreeIPA connectivity issues
Timeline · 3 updates
Intermittent Performance and Access Issues with the Cloudera Management Console
Timeline · 4 updates
DataHubs, DataLakes and FreeIPA are unreachable in US region
Timeline · 3 updates
Daily consumption report is not showing current usage data
Timeline · 4 updates
- identified Jun 23, 2025, 12:35 PM UTC
The daily consumption report that is available on the Cloudera Management Console has not been reporting current usage data since Thu Jun 19th 2025. Our teams have identified the source of the issue and are working on implementing a solution to make the data available by this Friday Jun 27th 2025, if not sooner. While the team is working on resolving the issue, customers may use https://my.cloudera.com/account/usage to access their consumption reports.
- monitoring Jun 24, 2025, 07:08 AM UTC
Our teams have successfully deployed the fix. The data pipeline is running, and data up to 21st June is available. The backlog in consumption reporting data will be processed gradually. Meanwhile, customers may use https://my.cloudera.com/account/usage to get the consumption reports. Our teams are actively monitoring the situation.
- resolved Jun 27, 2025, 06:10 AM UTC
Our teams have confirmed that the issue has been resolved. The data pipelines have now caught up on the backlogged consumption reporting data. If you are still experiencing issues or have any questions, please raise a support case with us. A root cause analysis (RCA) will be published within seven business days.
- postmortem Jul 02, 2025, 06:14 AM UTC
On 23rd of June, we were notified that our Consumption Dashboards had not been displaying current usage data since June 19th, 2025. Upon further investigation, we identified that the pipeline that ingests data into consumption was not progressing due to out-of-memory crash errors. The issue was caught and fixed, but the catch-up of the pipeline caused a delay of 2-3 days. While the reprocessing was happening in the background, customers could still use https://my.cloudera.com/account/usage to get the same consumption reports. Our teams have instituted preemptive measures and refined our internal protocols to manage similar incidents proactively in the future. We apologize for any inconvenience this may have caused and remain committed to implementing all necessary measures to prevent the recurrence of such issues.
Intermittent 504 errors on the Data Catalog
Timeline · 2 updates
- resolved Jun 06, 2025, 02:58 PM UTC
A temporary interruption in access to the DataCatalog service was detected by our internal monitoring systems. This interruption has been classified as transient, indicating that it was not a prolonged or persistent outage, though it did affect accessibility to the service for a brief period of time. Our technical teams are actively investigating the root cause of this disruption to understand precisely what occurred. An update will be disseminated as soon as the investigation concludes and a definitive explanation for the disruption is established.
- postmortem Jun 16, 2025, 01:40 PM UTC
On June 3rd 2025 our internal systems detected intermittent 504 errors on the Data Catalog service UI. The interruption was caused by Data Catalog service components prematurely indicating they were ready to handle requests immediately after a scheduled system restart. In reality, these components were still performing essential internal data updates, a process which took approximately 24 minutes. During this period, the service was unable to fully process user requests, leading to intermittent access issues and 504 errors. To prevent similar occurrences, we have enhanced our system's readiness checks to ensure the Data Catalog service is fully prepared to serve traffic. We apologize for any inconvenience caused by the service disruption. We are committed to providing a reliable and robust platform and truly appreciate your understanding.
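The failure mode described in this postmortem, a component reporting ready while it is still running internal data updates, is commonly avoided by gating the readiness signal on the completion of every startup task. The postmortem does not describe how Cloudera implemented its enhanced checks, so the following is only a minimal Python sketch of the general idea; the class `ReadinessGate` and the task names are hypothetical.

```python
import threading


class ReadinessGate:
    """Reports ready only after every registered startup task has finished."""

    def __init__(self, tasks):
        self._pending = set(tasks)
        self._lock = threading.Lock()

    def mark_done(self, task):
        """Record that one startup task has completed."""
        with self._lock:
            self._pending.discard(task)

    def probe(self):
        """Return an HTTP-style (status, body) pair for a readiness endpoint."""
        with self._lock:
            if self._pending:
                return 503, "pending: " + ", ".join(sorted(self._pending))
            return 200, "ready"


# Hypothetical startup tasks; the real service's tasks are not documented.
gate = ReadinessGate(["load_config", "refresh_internal_data"])
gate.mark_done("load_config")
status_during_refresh = gate.probe()   # data refresh still running
gate.mark_done("refresh_internal_data")
status_after_refresh = gate.probe()    # all startup work finished

print(status_during_refresh)
print(status_after_refresh)
```

A load balancer or orchestrator that polls such an endpoint would keep traffic away from the component for the whole update window (about 24 minutes in this incident) instead of routing requests that end in 504 errors.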
Intermittent FreeIPA Service disruption in US Control Plane.
Timeline · 5 updates
- investigating Apr 03, 2025, 11:41 AM UTC
Current Status: We are currently investigating a potential issue with the US control plane service. We will have an update within 60 mins. Customer Experience: Login and CDP operations may be impacted. Running workloads on clusters should not be impacted. Incident Start time: 6:30 UTC April 3rd, 2025
- investigating Apr 03, 2025, 12:42 PM UTC
Current Status: Our teams are continuing their investigation to determine the source of the issue. We will have another update within 60 mins. Customer Experience: Login and CDP operations may be impacted. Running workloads on clusters should not be impacted. Incident Start time: 6:30 UTC April 3rd, 2025.
- investigating Apr 03, 2025, 01:47 PM UTC
Current Status: Our teams are continuing their investigation to determine the source of the issue. We have ruled out a workload issue. We will have another update within 60 mins. Customer Experience: Login and CDP operations may be impacted. Incident Start time: 6:30 UTC April 3rd, 2025.
- resolved Apr 03, 2025, 03:25 PM UTC
Current Status: Our teams have successfully deployed a fix for the issue and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions please raise a support case with us. A root cause analysis (RCA) will be published within seven business days. Customer Experience: Login and CDP operations via the portal were impacted. API and CLI operations were working fine, and workloads were not impacted. Incident Start time: 6:30 UTC April 3rd, 2025. Incident End time: 15:15 UTC April 3rd, 2025.
- postmortem Apr 15, 2025, 07:49 PM UTC
On 3rd April 2025, there was an intermittent service disruption impacting login and CDP operations via the Management Console portal. Subsequent investigation identified that the root cause stemmed from an increase in latency with internal DNS servers. In response, corrective measures were promptly taken to address the issue. Furthermore, we have implemented supplementary checks and preemptive actions to prevent similar occurrences in the future. We apologise for any inconvenience caused by the service disruption. We are fully committed to providing a reliable and robust platform and truly appreciate your understanding.
FreeIPA Service disruption in EU Control Plane
Timeline · 2 updates
- resolved Mar 12, 2025, 06:02 PM UTC
Current Status: Our teams have identified the source of the intermittent issue impacting the FreeIPA service in the EU Control Plane. We have successfully deployed a fix for the issue and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions please raise a support case with us. We sincerely apologize for any inconvenience this may have caused and we will have the root cause analysis (RCA) published within seven business days. Customer Experience: During this outage customers on the EU Control Plane may have experienced intermittent issues with the FreeIPA service, impacting its health status and autoscaling operations.
- postmortem Apr 09, 2025, 01:50 PM UTC
On March 12, 2025, intermittent service disruptions were experienced due to an expired certificate served by an endpoint. Despite an automated successful certificate renewal, staggered pod restarts resulted in inconsistencies, with older ingress controller pods serving the expired certificate alongside a recently updated pod. This discrepancy led to intermittent user errors. Resolution was achieved through a systematic restart of all affected components, ensuring uniform application of the renewed certificate and restoring service consistency. We have also implemented measures to prevent similar occurrences. We sincerely apologize for any inconvenience this service disruption may have caused. We are committed to maintaining the highest standards of service reliability and appreciate your understanding.
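The inconsistency described here, where some ingress pods serve a renewed certificate while others still serve the expired one, can be detected by comparing what each pod actually presents. The following is a minimal Python sketch under stated assumptions: the per-pod observations are supplied as plain data (collecting them from live pods is out of scope), and the pod names, fingerprints, and dates are made up for illustration rather than taken from the incident.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ServedCert(NamedTuple):
    pod: str             # hypothetical pod name
    fingerprint: str     # hash of the certificate the pod presents
    not_after: datetime  # certificate expiry time


def audit(certs, now):
    """Return (sorted expired pod names, True if pods serve differing certs)."""
    expired = sorted(c.pod for c in certs if c.not_after <= now)
    mixed = len({c.fingerprint for c in certs}) > 1
    return expired, mixed


utc = timezone.utc
now = datetime(2025, 3, 12, 12, 0, tzinfo=utc)

# Illustrative observations: two pods not yet restarted, one already renewed.
observed = [
    ServedCert("ingress-0", "aa11", datetime(2025, 3, 12, 0, 0, tzinfo=utc)),
    ServedCert("ingress-1", "aa11", datetime(2025, 3, 12, 0, 0, tzinfo=utc)),
    ServedCert("ingress-2", "bb22", datetime(2026, 3, 12, 0, 0, tzinfo=utc)),
]

expired_pods, mixed = audit(observed, now)
print(expired_pods, mixed)
```

A check of this shape flags both symptoms at once: the pods still serving the expired certificate, and the fact that the fleet is not uniform after a renewal, which is the cue to restart the remaining pods.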
EU Control Plane Service Disruption
Timeline · 5 updates
CDP Management Console not accessible in US Control Plane
Timeline · 3 updates
- investigating Jan 14, 2025, 08:46 PM UTC
Current Status: We are currently investigating a potential issue with the CDP Management Console. We will have an update within 60 mins. Customer Experience: Customers may observe 404/Bad Gateway errors while accessing the CDP Management Console.
- resolved Jan 14, 2025, 09:41 PM UTC
Current Status: Our teams have successfully deployed a fix for the issue and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions please raise a support case with us. A root cause analysis (RCA) will be published within seven business days. Customer Experience: Customers may have observed 404/Bad Gateway errors while accessing the CDP Management Console.
- postmortem Jan 27, 2025, 04:04 PM UTC
On January 13th, 2025, a few of our customers experienced service disruption with the Cloudera Management Console user interface, leading to customers encountering 404 and 5xx errors when navigating the console. This disruption stemmed from an unusually high volume of notification events. We have taken immediate action to handle this unexpected surge in events. We sincerely apologise for any inconvenience this may have caused and greatly appreciate your patience and understanding.
Authentication Issues while accessing workload URLs on the US Control Plane region
Timeline · 2 updates
- resolved Jan 13, 2025, 03:08 PM UTC
Current Status: Our teams have identified an issue affecting the authentication services while accessing workload URLs on the US Control Plane region. While a majority of the authentication requests were working as expected, a small percentage of requests were impacted. The issue has been addressed and our teams are continuing to monitor it closely. If you are still experiencing issues or have any questions please raise a support case with us. A root cause analysis (RCA) will be published within seven business days. Customer Experience: During this outage customers may have experienced issues while accessing workload URLs on the US Control Plane region.
- postmortem Jan 23, 2025, 02:58 PM UTC
On Jan 9th, Cloudera implemented a planned system change affecting how customers accessed their CDP workload URLs. This change inadvertently caused issues for customers who accessed their CDP workload URLs before authenticating. Specifically, customers who had their identity provider (IDP) configured to expect a certain URL experienced authentication failures. This affected customers who initiated the authentication process from the service provider's side. Cloudera promptly rolled back the system change, restoring normal access to CDP workload URLs. We sincerely apologize for any inconvenience this disruption may have caused and appreciate your understanding and patience during this resolution.
Performance degraded in Control Plane US Prod Region.
Timeline · 2 updates
- resolved Dec 02, 2024, 07:29 PM UTC
Current Status: Our teams have identified an issue causing degraded performance in the Control Plane US Prod Region. The issue has been addressed and our teams are continuing to monitor it closely. If you are still experiencing issues or have any questions please raise a support case with us. A root cause analysis (RCA) will be published within seven business days. Customer Experience: During this outage customers may have experienced issues with Control Plane operations. Incident Start time: 17:20 UTC December 2nd, 2024. Incident End time: 17:33 UTC December 2nd, 2024.
- postmortem Dec 13, 2024, 02:48 PM UTC
On December 2, 2024, between 17:20 and 17:33 UTC, our US Control Plane experienced a 13-minute service degradation. This incident occurred during routine maintenance of our internal infrastructure, where an error caused the unintentional degradation of nodes. We have taken immediate action to enhance our internal protocols and implement preventive measures to avoid similar incidents in the future. We apologise for any inconvenience caused and remain dedicated to ensuring a seamless experience for our valued customers. Thank you for your patience and understanding.
Cloudera Single sign-on (CSSO) login interruption
Timeline · 2 updates
- identified Nov 15, 2024, 12:41 PM UTC
Due to a recent disruption in Salesforce services, customers who use Cloudera Single Sign-On (CSSO) for accessing their Public Cloud Management Console may experience login issues. Customers utilizing their own Identity Providers (IDPs) for Single Sign-On will not be affected by this service interruption. We appreciate your understanding and patience as we work through this matter. If you have any questions or need assistance, please do not hesitate to reach out to our support team at US: 1-866-809-4659 / International: 1-650-425-9421. Thank you for your continued support.
- resolved Nov 15, 2024, 04:01 PM UTC
Our teams confirmed that our Cloudera Single Sign-on (CSSO) is now stable and operational. We apologize for any inconvenience this may have caused. If you are still experiencing issues or have any questions please raise a support case with us. Thank you for your continued support.
Cloudera Data Engineering (CDE) - New Installs & In-place upgrade fails
Timeline · 3 updates
- identified Sep 18, 2024, 10:03 PM UTC
Our teams have identified an issue that impacts new installs and upgrades of the Cloudera Data Engineering (CDE) DataService on AWS. We are working on a fix and we expect to know the ETA for the fix by the end of this week. We advise against planning any CDE environment upgrades until a fix is made available. Alternatively, you may seek assistance from our support team, who can provide guidance on implementing a manual workaround for the issue. Customer Experience: AWS customers may observe issues while installing or upgrading Cloudera Data Engineering (CDE) DataServices. Further updates will be shared as we make progress on releasing the fix.
- identified Sep 20, 2024, 03:03 PM UTC
During the course of the investigation, we have been notified that customers deploying new Cloudera Data Engineering (CDE) services on Azure are also experiencing issues. This issue only affects new installations and upgrades of the CDE services; other operations are not impacted. We recommend that customers refrain from performing any upgrades until the fix is released, which is targeted for the end of this month. In the meantime, our support team is available to provide guidance on implementing a manual workaround for the issue. We regret any inconvenience caused by this situation and are working to have this resolved soon.
- resolved Sep 30, 2024, 08:09 PM UTC
Current Status: We have successfully deployed a fix for the issue and confirmed that it has been resolved. Customers may now resume any pending upgrade activities on their Cloudera Data Engineering (CDE) services. We sincerely apologize for any inconvenience caused by this service disruption and are committed to providing a reliable and robust platform. We truly appreciate your understanding. If you are still experiencing issues or have any questions, please raise a support case with us.
FreeIPA servers show as unreachable
Timeline · 3 updates
- investigating Aug 29, 2024, 06:09 PM UTC
Current Status: We are currently investigating a potential issue with the FreeIPA service. We will have an update within 60 mins. Customer Experience: Customers may observe issues while performing control plane operations on environments.
- resolved Aug 29, 2024, 06:53 PM UTC
Current Status: Our teams have successfully deployed a fix for the issue and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions please raise a support case with us. A root cause analysis (RCA) will be published within seven business days. Customer Experience: During this window customers may have experienced issues with FreeIPA service showing unreachable. Incident Start time: 17:21 August 29th, 2024 UTC Incident End time: 18:11 August 29th, 2024 UTC
- postmortem Sep 12, 2024, 03:35 PM UTC
On August 29, 2024, our FreeIPA service experienced connectivity issues to our Control Plane service. Subsequent investigation identified the root cause stemmed from expiration of the CCMv2 certificate. In response, corrective measures were promptly taken to address the issue. Furthermore, we have implemented supplementary checks and preemptive actions to prevent similar occurrences in the future. We apologise for any inconvenience caused by the service disruption. We are fully committed to providing a reliable and robust platform and truly appreciate your understanding.
Users unable to login to CDP Public Cloud console
Timeline · 2 updates
- resolved Jul 16, 2024, 09:05 PM UTC
Current Status: Our teams have successfully deployed a fix for the issue and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions please raise a support case with us. A root cause analysis (RCA) will be published within seven business days. Customer Experience: During this window, customers who tried accessing their CDP Console using CSSO credentials were impacted.
- postmortem Jul 23, 2024, 02:52 PM UTC
On 13th July 2024, Cloudera SSO logins to the CDP Public Cloud Console were affected due to the expiration of an internal certificate on our customer portal. The issue was rectified through the installation of new certificates. In response, we have instituted preemptive measures and refined our internal protocols to proactively manage similar incidents in the future. We apologise for any inconvenience this may have caused and remain committed to implementing all necessary measures to prevent the recurrence of such issues.
Cloudera AI Inference and Model registry Creation Failures
Timeline · 6 updates
- investigating Jun 28, 2024, 02:25 AM UTC
Current Status: We are currently investigating a potential issue with Cloudera AI Inference and Model registry. We will have an update within 60 mins. Customer Experience: Customers may observe issues while doing new deployments and model registry upgrade activity. Incident Start time: 5 PM UTC June 25th, 2024
- identified Jun 28, 2024, 03:27 AM UTC
Current Status: Our teams have identified the source of the issue. We are working on developing and implementing a solution to restore the service(s). We will have another update within 60 mins. Customer Experience: Customers may observe issues while doing new deployments and model registry upgrade activity. Incident Start time: 5 PM UTC June 25th, 2024
- identified Jun 28, 2024, 04:26 AM UTC
Current Status: Our teams have identified and fixed the issue in US region. We are still investigating the issue in AP and EU regions. We will have another update within 60 mins. Customer Experience: Customers may observe issues while doing new deployments and model registry upgrade activity. Incident Start time: 5 PM UTC June 25th, 2024
- identified Jun 28, 2024, 05:34 AM UTC
Current Status: We continue investigating the issue in AP and EU regions. US region is functioning as expected. We will have another update within 60 mins. Customer Experience: Customers may observe issues while doing new deployments and model registry upgrade activity. Incident Start time: 5 PM UTC June 25th, 2024
- resolved Jun 28, 2024, 11:50 AM UTC
Current Status: Our teams have successfully deployed a fix for the issue and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions please raise a support case with us. A root cause analysis (RCA) will be published within seven business days. Customer Experience: Customers may observe issues while doing new deployments and model registry upgrade activity. Incident Start time: 5 PM UTC June 25th, 2024 Incident End time: 5:34 AM UTC June 28th, 2024
- postmortem Jul 16, 2024, 06:08 AM UTC
On June 27, 2024, our internal monitoring detected an issue with new deployments and upgrades of the Cloudera AI Inference and Model registry on our US control plane. Further investigation revealed configuration issues between our internal microservices. Changes were made to address the issue and corrective actions have been implemented. We apologize for any inconvenience caused by the service disruption. We are fully committed to providing a reliable and robust platform and truly appreciate your understanding.
FreeIPA is unable to reach the US West Control Plane
Timeline · 6 updates
- investigating Jun 05, 2024, 05:44 AM UTC
Current Status: We are currently investigating a potential issue with the FreeIPA service. We will have an update within 60 mins. Customer Experience: Customers may observe issues while performing control operations on environments in the US West. Incident Start time: 05:00 UTC June 5th, 2024
- investigating Jun 05, 2024, 06:28 AM UTC
We are continuing to investigate this issue.
- identified Jun 05, 2024, 06:33 AM UTC
Current Status: Our teams have identified the source of the issue. We are working on developing and implementing a solution to restore the service(s). We will have another update within 60 mins. Customer Experience: During this window, customers may observe issues while performing control operations on environments in the US West. Incident Start time: 05:00 UTC June 5th, 2024
- monitoring Jun 05, 2024, 07:03 AM UTC
A fix has been implemented and we are monitoring the results.
- resolved Jun 05, 2024, 07:22 AM UTC
Our teams have successfully deployed a fix for the issue and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions please raise a support case with us. A root cause analysis (RCA) will be published within seven business days. Customer Experience: Customer may observe issues when accessing control plane for US West. Incident Start time: 01:00 UTC June 5th, 2024 Incident End time: 07:07 UTC June 5th, 2024
- postmortem Jun 12, 2024, 06:05 PM UTC
On June 5, 2024, our FreeIPA service experienced connectivity issues to our Control Plane service. Subsequent investigation identified the root cause stemmed from recent ingress changes implemented the day prior. In response, corrective measures were promptly taken to address the issue. Furthermore, we have implemented supplementary checks and preemptive actions to prevent similar occurrences in the future. We apologise for any inconvenience caused by the service disruption. We are fully committed to providing a reliable and robust platform and truly appreciate your understanding.