Confluent Outage History

Confluent had 53 outages in the last 2 years totaling 383h 31m of downtime — averaging 2.2 incidents per month.

Each outage recorded since June 24, 2025 is summarised below, with incident details, duration, and resolution information.

Source: https://status.confluent.cloud
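
The Duration and per-month figures on this page can be reproduced from the listed timestamps with simple arithmetic. The following is a minimal sketch, not the tracker's actual implementation: the function names are hypothetical, the timestamp layout and the "2d 6h" / "7h 47m" / "25m" rendering are inferred from the entries below, and the 24-month window takes the "last 2 years" statement at face value. Because the listed timestamps are rounded to the minute, a result may differ by one minute from the Duration shown for some entries.

```python
from datetime import datetime

# Timestamp layout as displayed on this page, e.g. "Oct 29, 2025, 04:55 PM UTC".
# Note: %b and %p assume an English locale.
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_stamp(text):
    """Parse a 'Detected' or 'Resolved' timestamp string from an entry."""
    return datetime.strptime(text, STAMP_FORMAT)


def format_duration(detected, resolved):
    """Render the gap between two timestamps in the style used on this page.

    Returns None when the resolved time is not after the detected time
    (the page shows a dash for such entries).
    """
    delta = parse_stamp(resolved) - parse_stamp(detected)
    total_minutes = int(delta.total_seconds() // 60)
    if total_minutes <= 0:
        return None
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    if days:
        # Multi-day entries on the page drop the minutes.
        return f"{days}d {hours}h" if hours else f"{days}d"
    if hours:
        return f"{hours}h {minutes}m"
    return f"{minutes}m"


def incidents_per_month(incident_count, months):
    """Average number of incidents per month, rounded to one decimal."""
    return round(incident_count / months, 1)


print(format_duration("Oct 29, 2025, 04:55 PM UTC", "Oct 30, 2025, 12:42 AM UTC"))  # 7h 47m
print(incidents_per_month(53, 24))  # 2.2
```

Returning None rather than a formatted string for non-positive gaps keeps retroactively posted notices, where the resolved time precedes the detection time, from producing a negative duration.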

Notice October 29, 2025

Brief unavailability of produce requests for Enterprise SKU clusters in AWS us-east-1

Detected by Pingoru: Oct 29, 2025, 08:58 PM UTC
Resolved: Oct 29, 2025, 05:00 PM UTC
Duration: —

Timeline · 1 update

resolved Oct 29, 2025, 08:58 PM UTC

Some customers in AWS us-east-1 may have experienced unavailability of produce requests for Enterprise SKU clusters from 3:40 to 4:05 PM UTC.

Major October 29, 2025

Degraded Flink statements in Azure

Detected by Pingoru: Oct 29, 2025, 04:55 PM UTC
Resolved: Oct 30, 2025, 12:42 AM UTC
Duration: 7h 47m

Affected: Confluent Cloud

Timeline · 6 updates

investigating Oct 29, 2025, 04:55 PM UTC

Confluent engineers are engaged on the incident. The impact is related to the ongoing Azure incident. You can find more information here: https://azure.status.microsoft/en-us/status
investigating Oct 29, 2025, 06:12 PM UTC

We are continuing to investigate this issue.
identified Oct 29, 2025, 06:39 PM UTC

Confluent has confirmed that Flink statements running in Azure are currently degraded. Microsoft is actively working to roll back the configuration change.
monitoring Oct 29, 2025, 06:59 PM UTC

Microsoft has indicated they have completed rolling back the network configuration change. Degraded Flink statements have begun recovering.
monitoring Oct 29, 2025, 08:35 PM UTC

Flink statements in Azure have recovered following Microsoft’s rollback of a recent network configuration change. Confluent systems are fully operational at this time, though Microsoft is continuing to monitor the rollback and validate stability. We will remain in monitoring until Microsoft confirms full recovery.
resolved Oct 30, 2025, 12:42 AM UTC

Microsoft reports Azure status as fully operational. The incident is now resolved as of 0:40 UTC on Oct 30th, 2025.

Major October 27, 2025

Customers are encountering a 404 error when attempting to access resources within the Confluent Cloud UI.

Detected by Pingoru: Oct 27, 2025, 05:48 PM UTC
Resolved: Oct 27, 2025, 06:13 PM UTC
Duration: 25m

Affected: Confluent Cloud

Timeline · 3 updates

investigating Oct 27, 2025, 05:48 PM UTC

We are currently investigating the root cause.
investigating Oct 27, 2025, 05:56 PM UTC

We are currently investigating the issue. In the meantime, customers can use the CLI to manage the resources.
resolved Oct 27, 2025, 06:13 PM UTC

This incident has been resolved.

Minor October 22, 2025

Delayed cluster provisioning in Azure eastus region

Detected by Pingoru: Oct 22, 2025, 10:38 PM UTC
Resolved: Oct 25, 2025, 05:14 AM UTC
Duration: 2d 6h

Affected: Confluent Cloud

Timeline · 5 updates

identified Oct 22, 2025, 10:38 PM UTC

We are experiencing provisioning delays in Azure eastus region.
identified Oct 23, 2025, 08:17 PM UTC

We've identified a mitigation method and are rolling this out across affected networks.
identified Oct 23, 2025, 11:56 PM UTC

Provisioning delays have been partially mitigated, but there may still be delays for eastus region.
identified Oct 24, 2025, 05:05 PM UTC

Dedicated private networks and dedicated Kafka clusters provisioning with PrivateLink or VNet Peering are unavailable in the Azure eastus region.
resolved Oct 25, 2025, 05:14 AM UTC

This incident has been resolved.

Critical October 22, 2025

Flink statements failing in multiple regions

Detected by Pingoru: Oct 22, 2025, 04:21 PM UTC
Resolved: Oct 22, 2025, 10:45 PM UTC
Duration: 6h 24m

Affected: Confluent Cloud

Timeline · 6 updates

investigating Oct 22, 2025, 04:21 PM UTC

Flink statements are failing to execute across multiple regions. The impact started at 15:12 UTC on Oct. 22, 2025. Customers may be experiencing increased rate of degraded statements and internal errors. We are currently investigating the issue.
identified Oct 22, 2025, 05:42 PM UTC

The issue causing Flink statement failures has been identified and a hotfix has been prepared. We are actively working to mitigate all impacted compute pools.
identified Oct 22, 2025, 07:04 PM UTC

We are continuing to mitigate Flink statement failures and apply fixes to necessary compute pools. We will post an update in 1 hour, or sooner if mitigation has been completed.
identified Oct 22, 2025, 08:33 PM UTC

A fix has been applied to a number of Flink compute pools, but some customers may still see statement failures. We are continuing to apply the fixes to the remaining impacted pools. The next update will be posted in 1 hour, or sooner if mitigation has been completed.
monitoring Oct 22, 2025, 10:25 PM UTC

All impacted Flink compute pools have been fixed and degraded statements have now recovered. We are continuing to monitor to ensure no further issues occur.
resolved Oct 22, 2025, 10:45 PM UTC

All impacted Flink compute pools have been fixed and degraded statements have recovered.

Major October 20, 2025

Experiencing Performance Degradation in AWS in US-East-1 region

Detected by Pingoru: Oct 20, 2025, 09:53 AM UTC
Resolved: Oct 21, 2025, 12:05 PM UTC
Duration: 1d 2h

Affected: Confluent Cloud

Timeline · 13 updates

Minor October 17, 2025

Basic/Std clusters in AWS ap-southeast-1 are experiencing degraded performance

Detected by Pingoru: Oct 17, 2025, 10:44 PM UTC
Resolved: Oct 18, 2025, 05:21 AM UTC
Duration: 6h 36m

Affected: Confluent Cloud

Timeline · 3 updates

investigating Oct 17, 2025, 10:44 PM UTC

We are currently investigating this issue.
monitoring Oct 17, 2025, 11:57 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Oct 18, 2025, 05:21 AM UTC

This incident has been resolved.

Minor October 6, 2025

Cluster provisioning delayed in AWS

Detected by Pingoru: Oct 06, 2025, 07:24 PM UTC
Resolved: Oct 08, 2025, 07:59 AM UTC
Duration: 1d 12h

Affected: Confluent Cloud

Timeline · 5 updates

investigating Oct 06, 2025, 07:24 PM UTC

We are currently experiencing delays in cluster provisioning in AWS. The team is currently investigating.
identified Oct 06, 2025, 09:37 PM UTC

We have identified the root cause as a failure in the underlying disk-mounting mechanism, which has required manual intervention. The impacted nodes have been fixed, and we are preparing to roll out a permanent fix.
identified Oct 07, 2025, 07:46 PM UTC

A fleet-wide deployment containing a permanent fix is underway. We are continuing to monitor the rollout and will update when it is complete.
monitoring Oct 08, 2025, 03:20 AM UTC

The deployment is now complete and the AWS cluster provisioning delays have been mitigated as of 03:20 UTC on Oct 8, 2025. We will monitor for any residual issues before resolving this incident in 1 hour.
resolved Oct 08, 2025, 07:59 AM UTC

This incident has been resolved.

Minor October 3, 2025

Delays in Provisioning affecting Schema Registry, KSQL and Stream Governance resources

Detected by Pingoru: Oct 03, 2025, 04:43 AM UTC
Resolved: Oct 04, 2025, 03:46 PM UTC
Duration: 1d 11h

Affected: Confluent Cloud

Timeline · 4 updates

investigating Oct 03, 2025, 04:43 AM UTC

We are currently experiencing degraded performance in the provisioning flow for Schema Registry, KSQL, and Stream Governance resources. Impact: Newly created clusters and API keys for these services may be delayed. You may also encounter authentication/authorization errors when using newly created keys. Unaffected: Existing clusters, keys, and resources continue to function as expected. Our engineering team is actively investigating and working to restore normal provisioning times. We will provide updates as more information becomes available.
investigating Oct 03, 2025, 08:27 AM UTC

Provisioning performance restored: provisioning times for Schema Registry, KSQL, and Stream Governance resources have returned to normal. Our engineering team is continuing to investigate the root cause to ensure long-term stability.
monitoring Oct 03, 2025, 06:26 PM UTC

Provisioning times have returned to normal since the last update and a mitigation has been put in place to prevent further regressions. Confluent will continue monitoring to ensure that the issues have been resolved.
resolved Oct 04, 2025, 03:46 PM UTC

This incident has been fully resolved.

Major September 30, 2025

AWS us-east-1 Service Operation Failures

Detected by Pingoru: Sep 30, 2025, 04:58 PM UTC
Resolved: Oct 01, 2025, 05:29 PM UTC
Duration: 1d

Affected: Confluent Cloud

Timeline · 6 updates

investigating Sep 30, 2025, 04:58 PM UTC

Confluent is currently experiencing a networking issue in the AWS us-east-1 region that is affecting service operations. Our team has identified the root cause and is actively working on mitigation to restore full service. We will provide updates as progress continues.
identified Sep 30, 2025, 05:01 PM UTC

The root cause has been identified and our team is actively working on mitigation to restore full service.
identified Sep 30, 2025, 08:19 PM UTC

Our team continues to actively work on mitigating the networking issue in AWS us-east-1. Progress is being made toward restoring full service.
identified Oct 01, 2025, 12:06 AM UTC

We continue to work on mitigation while some operations have been restored in AWS us-east-1.
monitoring Oct 01, 2025, 01:26 PM UTC

As of 13:19 UTC on October 1st, 2025, this incident has been mitigated. We'll continue monitoring but no longer expect any customer impact.
resolved Oct 01, 2025, 05:29 PM UTC

This incident has been resolved.

Major September 27, 2025

Azure Switzerland North Service Outage

Detected by Pingoru: Sep 27, 2025, 01:00 AM UTC
Resolved: Sep 27, 2025, 11:24 PM UTC
Duration: 22h 24m

Affected: Confluent Cloud

Timeline · 5 updates

investigating Sep 27, 2025, 01:00 AM UTC

As of 23:54 UTC on 26 Sep 2025, customers with clusters in the Azure Switzerland North region may experience downtime due to an Azure region outage. Confluent engineering has been engaged and is investigating the issue.
identified Sep 27, 2025, 03:01 AM UTC

As of 23:54 UTC on 26 Sep 2025, customers with clusters in the Azure Switzerland North region may experience service unavailability or degraded performance due to an Azure region outage. Azure has identified a networking issue and is preparing a hotfix.
identified Sep 27, 2025, 05:37 AM UTC

Starting at 23:54 UTC on 26 September 2025, customers in Switzerland North may experience service unavailability or degraded performance for resources hosted in the region. Virtual Machines may have shut down to preserve data integrity. We were alerted to this issue by our telemetry informing us of a significant drop in traffic. It was discovered that a recent deployment introduced a malformed prefix in one of the certificates used for connection authorization. We have pinpointed the deployment error involving the certificate prefix and are rolling back the faulty deployment to restore normal traffic flow and service availability. The majority of the impacted services have been fully recovered, and a subset are nearing completion. We continue to monitor traffic and service stability to ensure full recovery.
identified Sep 27, 2025, 09:26 PM UTC

Starting 23:54 UTC on 26 Sep 2025, customers with clusters in the Azure Switzerland North region may experience service unavailability or degraded performance due to an Azure region outage. Azure has fixed the issue and we are working to recover all affected services. The majority of impacted services have been fully recovered, and a subset are nearing completion. We continue to monitor traffic and service stability to ensure full recovery.
resolved Sep 27, 2025, 11:24 PM UTC

All services in the Azure Switzerland North region have now been fully recovered.

Minor September 24, 2025

Transit Gateway Connections UI Shows Failed State

Detected by Pingoru: Sep 24, 2025, 10:46 PM UTC
Resolved: Sep 25, 2025, 11:53 PM UTC
Duration: 1d 1h

Affected: Confluent Cloud

Timeline · 4 updates

investigating Sep 24, 2025, 10:46 PM UTC

As of September 24, 2025, customers may observe that their AWS Transit Gateway connection is in a Failed state from within the Confluent Cloud UI, but connectivity is unaffected. Engineering has been engaged and is investigating the issue.
investigating Sep 25, 2025, 02:41 AM UTC

As of September 24, 2025, some customers may see their AWS Transit Gateway connection displayed as Failed in the Confluent Cloud UI. Connectivity is not impacted, and traffic is flowing normally. Our engineering team is actively investigating.
investigating Sep 25, 2025, 10:23 AM UTC

As of September 25, 2025, customers may observe that their AWS Transit Gateway connection is in a Failed state from within the Confluent Cloud UI. However, this shouldn't impact any of your services or connectivity. From our assessment, this is likely a problem on the AWS side, and we have escalated to AWS for resolution.
resolved Sep 25, 2025, 11:53 PM UTC

AWS has confirmed that when an external principal is removed from a customer's resource share, Confluent cannot determine if the TGW attachment still exists. This results in the AWS Transit Gateway connection displaying in the Failed state from within the Confluent Cloud UI. Affected users should validate the TGW resource share is in place per our documentation (https://docs.confluent.io/cloud/current/networking/aws-transit-gateway.html#add-an-aws-transit-gateway-attachment-in-ccloud).

Minor September 18, 2025

Confluent Cloud API outage

Detected by Pingoru: Sep 18, 2025, 08:27 PM UTC
Resolved: Sep 18, 2025, 11:19 PM UTC
Duration: 2h 51m

Affected: Confluent Cloud

Timeline · 3 updates

monitoring Sep 18, 2025, 08:27 PM UTC

We experienced issues related to our identity service, which impacted all Confluent Cloud management APIs. This resulted in an outage on our UI, API, CLI and Terraform interfaces. We have mitigated the issue and are now monitoring the APIs to ensure they are stable.
monitoring Sep 18, 2025, 08:27 PM UTC

We are continuing to monitor for any further issues.
resolved Sep 18, 2025, 11:19 PM UTC

This incident has been resolved.

Minor September 4, 2025

Degraded Kafka performance affecting some Confluent Cloud clusters

Detected by Pingoru: Sep 04, 2025, 09:42 AM UTC
Resolved: Sep 08, 2025, 05:49 PM UTC
Duration: 4d 8h

Affected: Confluent Cloud

Timeline · 6 updates

identified Sep 04, 2025, 09:42 AM UTC

We are experiencing issues related to internal topic compaction causing topic size to grow. This is impacting cluster performance in a limited number of clusters with extreme cases. We have identified the root cause and are actively working on a fix. Next update soon.
identified Sep 04, 2025, 12:43 PM UTC

Mitigation is still in progress. Next update in 3 hours.
identified Sep 04, 2025, 02:58 PM UTC

Mitigation is in progress. Next update in 2 hours.
monitoring Sep 04, 2025, 05:05 PM UTC

We have deployed mitigation steps. We'll continue monitoring but no longer expect any customer impact.
monitoring Sep 08, 2025, 05:47 PM UTC

Issues related to internal topic compaction have been resolved.
resolved Sep 08, 2025, 05:49 PM UTC

This incident has been resolved.

Minor September 2, 2025

Delays in KSQL and Schema Registry new cluster creation

Detected by Pingoru: Sep 02, 2025, 11:06 PM UTC
Resolved: Sep 03, 2025, 12:22 AM UTC
Duration: 1h 16m

Affected: Confluent Cloud

Timeline · 3 updates

investigating Sep 02, 2025, 11:06 PM UTC

We are experiencing delays in our data pipeline, which are impacting KSQL and Schema Registry cluster creation in all regions and cloud providers. We are currently investigating the issue and will post an update in the next 30 minutes.
monitoring Sep 02, 2025, 11:22 PM UTC

We have deployed mitigation steps and the delays have improved. We'll continue monitoring but no longer expect any customer impact.
resolved Sep 03, 2025, 12:22 AM UTC

This incident has been resolved.

Minor August 18, 2025

Potential delays in new or updated authentication rule enforcement and potential authentication failures for some OAUTH Kafka customers - All clusters - All regions

Detected by Pingoru: Aug 18, 2025, 06:18 PM UTC
Resolved: Aug 18, 2025, 07:45 PM UTC
Duration: 1h 26m

Affected: Confluent Cloud

Timeline · 4 updates

investigating Aug 18, 2025, 06:18 PM UTC

Starting at 08:00 UTC on August 18, 2025, some customers who have created or updated role-based access control policies during this incident may see a delay in those new rules getting applied. Additionally, customers who have configured OAUTH where their public key is set to be refreshed during this incident may see additional authentication failures. This can impact any cluster regardless of region or cloud provider. We are currently investigating the issue. We will post an update in 30 minutes.
investigating Aug 18, 2025, 06:52 PM UTC

We are continuing to investigate. The potential impact also includes customers who have configured OAUTH and have changed their public key in any way, including refreshing it or creating a new public key during this incident. We will post an update in 60 minutes, or sooner if mitigation has been achieved.
monitoring Aug 18, 2025, 07:39 PM UTC

As of 19:38 UTC on August 18, 2025, this incident has been mitigated. We'll continue monitoring but no longer expect any customer impact.
resolved Aug 18, 2025, 07:45 PM UTC

This incident has been resolved.

Critical August 6, 2025

Performance degradation for resources in GCP us-central1-a

Detected by Pingoru: Aug 06, 2025, 03:11 AM UTC
Resolved: Aug 06, 2025, 05:05 AM UTC
Duration: 1h 54m

Affected: Confluent Cloud

Timeline · 7 updates

investigating Aug 06, 2025, 03:11 AM UTC

This manifests itself as networking errors on affected clusters. We are currently investigating and attempting to mitigate or provide a work-around.
investigating Aug 06, 2025, 03:24 AM UTC

Affected resources are limited to a single zone in GCP, us-central1-a. We are continuing to investigate.
investigating Aug 06, 2025, 03:47 AM UTC

GCP has confirmed that, starting August 6th 12:46 PM UTC, us-central1 is experiencing packet loss causing elevated latency and errors. GCP has communicated that it will provide an update shortly.
investigating Aug 06, 2025, 04:00 AM UTC

Resources in multiple zones in GCP us-central1 are affected. We are continuing to investigate.
monitoring Aug 06, 2025, 04:28 AM UTC

Many clusters in the affected region are recovering. We are continuing to monitor.
monitoring Aug 06, 2025, 04:53 AM UTC

We are seeing mitigation for all impacted Kafka clusters in GCP us-central1 and are continuing to monitor.
resolved Aug 06, 2025, 05:05 AM UTC

This incident has been resolved.

Minor July 23, 2025

Confluent Cloud Metrics API is currently experiencing elevated error rates on certain historical queries for longer time ranges

Detected by Pingoru: Jul 23, 2025, 04:34 PM UTC
Resolved: Jul 23, 2025, 05:51 PM UTC
Duration: 1h 17m

Affected: Confluent Cloud

Timeline · 2 updates

monitoring Jul 23, 2025, 04:34 PM UTC

This issue started on July 23rd, 2025 08:00 UTC, and has been mitigated as of July 23rd, 2025 15:17 UTC. At this time, we have deployed a fix for the issue and are monitoring to see recovery.
resolved Jul 23, 2025, 05:51 PM UTC

We have verified that the fix rolled out works as expected and no longer see errors. All historical queries with longer time ranges should work as normal.

Notice July 15, 2025

We are experiencing degraded performance in AWS us-east-1 due to a zonal outage.

Detected by Pingoru: Jul 15, 2025, 02:21 AM UTC
Resolved: Jul 15, 2025, 02:21 AM UTC
Duration: —

Timeline · 1 update

resolved Jul 15, 2025, 02:21 AM UTC

This incident has been resolved.

Notice July 14, 2025

Customers might see connectivity issues with Confluent connectors, as well as missing metrics and audit logs for a few clusters in AWS us-west-2

Detected by Pingoru: Jul 14, 2025, 07:36 PM UTC
Resolved: Jul 15, 2025, 02:20 AM UTC
Duration: 6h 43m

Affected: Confluent Cloud

Timeline · 4 updates

monitoring Jul 14, 2025, 07:36 PM UTC

A fix has been implemented and we are monitoring the results.
monitoring Jul 14, 2025, 08:31 PM UTC

We are continuing to monitor for any further issues.
monitoring Jul 14, 2025, 08:32 PM UTC

We are continuing to monitor for any further issues.
resolved Jul 15, 2025, 02:20 AM UTC

This incident has been resolved.

Minor July 11, 2025

Customers may face issues with running or newly submitted Flink statements in AWS us-west-2

Detected by Pingoru: Jul 11, 2025, 04:58 PM UTC
Resolved: Jul 12, 2025, 12:06 AM UTC
Duration: 7h 8m

Affected: Confluent Cloud

Timeline · 3 updates

investigating Jul 11, 2025, 04:58 PM UTC

Confluent is observing similar symptoms to the earlier issue: https://status.confluent.cloud/incidents/c4nvjkxz5vqc Investigation and mitigation efforts are underway.
monitoring Jul 11, 2025, 10:55 PM UTC

We applied a mitigation that has brought the Flink service to a healthy status and are currently monitoring the service.
resolved Jul 12, 2025, 12:06 AM UTC

This incident has been resolved.

Major July 11, 2025

Confluent Cloud Flink - Customers may face issues with running or newly submitted Flink statements in AWS us-west-2

Detected by Pingoru: Jul 11, 2025, 09:18 AM UTC
Resolved: Jul 11, 2025, 12:43 PM UTC
Duration: 3h 25m

Affected: Confluent Cloud

Timeline · 5 updates

investigating Jul 11, 2025, 09:18 AM UTC

We are currently investigating the issue and attempting to mitigate or provide a work-around.
investigating Jul 11, 2025, 10:32 AM UTC

We are continuing to investigate this issue.
identified Jul 11, 2025, 10:33 AM UTC

The issue has been identified and a fix is being implemented.
monitoring Jul 11, 2025, 11:43 AM UTC

The issue has been fixed; Flink statements are running as expected and new statements can be submitted. We are continuing to monitor.
resolved Jul 11, 2025, 12:43 PM UTC

This incident has been resolved.

Minor July 8, 2025

Confluent Cloud customers accessing Freight brokers in the AWS ap-south-1 region might experience degraded service

Detected by Pingoru: Jul 08, 2025, 08:21 PM UTC
Resolved: Jul 09, 2025, 02:15 AM UTC
Duration: 5h 54m

Timeline · 4 updates

investigating Jul 08, 2025, 08:21 PM UTC

We are currently investigating this issue.
identified Jul 08, 2025, 09:56 PM UTC

The issue has been identified and the mitigation steps are being executed.
monitoring Jul 08, 2025, 10:34 PM UTC

The issue has been mitigated and Confluent is continuing to monitor the health of Freight brokers in the AWS ap-south-1 region.
resolved Jul 09, 2025, 02:15 AM UTC

This incident is resolved.

Major July 7, 2025

Flink APIs to fetch regions and compute pools are impacted globally

Detected by Pingoru: Jul 07, 2025, 06:58 PM UTC
Resolved: Jul 07, 2025, 08:47 PM UTC
Duration: 1h 48m

Timeline · 2 updates

monitoring Jul 07, 2025, 06:58 PM UTC

Starting Jul 7, approximately 17:39 UTC, Flink APIs to fetch regions and compute pools encountered failures. The issue was mitigated approximately Jul 7, 18:25 UTC and Confluent is actively monitoring the status.
resolved Jul 07, 2025, 08:47 PM UTC

The incident has been resolved and APIs are functioning correctly.

Notice July 5, 2025

Cluster produce/consume slowness observed in AWS ME-South-1

Detected by Pingoru: Jul 05, 2025, 02:48 AM UTC
Resolved: Jul 05, 2025, 02:00 AM UTC
Duration: —

Timeline · 1 update

resolved Jul 05, 2025, 02:48 AM UTC

Between 02:00 UTC and 02:05 UTC, some customers running clusters in AWS ME-SOUTH-1 may have experienced increased errors or slowness of their producers and consumers. This issue was detected and auto-mitigated.
