Replicated Outage History

Replicated had 40 outages since June 17, 2024 (roughly the last 2 years), totaling 376h 18m of downtime and averaging 1.6 incidents per month. Each is summarised below with incident details, duration, and resolution information.

Source: https://status.replicated.com
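
Each entry below lists a detection time, a resolution time, and a duration. As a rough sketch of how the three relate, the following Python parses two timestamps in the layout used on this page and formats the elapsed time in the same style. The helper names `parse_stamp` and `format_duration` are hypothetical, not part of any Replicated or Pingoru tooling. The page shows timestamps only to the minute, so a recomputed value can come out a minute longer than the listed one; that the listed durations are derived from unshown seconds is an assumption.

```python
from datetime import datetime, timezone

# Timestamp layout used in the entries on this page,
# e.g. "Apr 24, 2026, 05:33 PM UTC" (month names assume an English locale).
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_stamp(text: str) -> datetime:
    """Parse a 'Detected' or 'Resolved' timestamp into an aware UTC datetime."""
    return datetime.strptime(text, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def format_duration(detected: str, resolved: str) -> str:
    """Format elapsed time as 'Xd Yh', 'Xh Ym', or 'Ym' (hypothetical helper)."""
    elapsed = parse_stamp(resolved) - parse_stamp(detected)
    total_minutes = int(elapsed.total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    if days:
        return f"{days}d {hours}h"
    if hours:
        return f"{hours}h {minutes}m"
    return f"{minutes}m"


# Timestamps taken from the April 24, 2026 AKS entry.
print(format_duration("Apr 24, 2026, 05:33 PM UTC", "Apr 25, 2026, 01:29 PM UTC"))
```

Working from minute-level timestamps keeps the sketch simple, at the cost of the possible one-minute difference noted above.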

Major June 1, 2026

Upstream Quay.io degrading push/pull in Image Proxy

Detected by Pingoru: Jun 01, 2026, 03:19 PM UTC
Resolved: Jun 01, 2026, 04:39 PM UTC
Duration: 1h 19m

Affected: Quay.io Registry

Timeline · 3 updates

monitoring Jun 01, 2026, 03:19 PM UTC

A fix has been implemented and we are monitoring the results.
monitoring Jun 01, 2026, 04:00 PM UTC

We're continuing to monitor upstream notices from Red Hat regarding the quay.io incident. The latest messaging is that pulls are operational.
resolved Jun 01, 2026, 04:39 PM UTC

The upstream incident with Quay.io which saw failures to Push and Pull images from their registry is improving. We're seeing significantly better success rates for pull operations through the Proxy Registry. At this time impact to Replicated appears mitigated.

Minor May 22, 2026

Possible delays in provisioning CMX OpenShift relating to upstream dependencies

Detected by Pingoru: May 22, 2026, 04:32 PM UTC
Resolved: May 22, 2026, 04:40 PM UTC
Duration: 7m

Affected: OpenShift, OKE - Alpha

Timeline · 2 updates

monitoring May 22, 2026, 04:32 PM UTC

There are possible delays when provisioning OpenShift on CMX. This relates to an ongoing incident with an upstream dependency. We are closely watching this and will update as conditions improve. Thank you for your patience.
resolved May 22, 2026, 04:40 PM UTC

We're now observing that the upstream dependency issue is resolved and that performance is significantly better when provisioning OpenShift on CMX. Thank you for your patience with this issue.

Major May 19, 2026

Issues rendering and loading docs.replicated.com

Detected by Pingoru: May 19, 2026, 09:56 PM UTC
Resolved: May 19, 2026, 10:44 PM UTC
Duration: 48m

Affected: Netlify Standard Edge Network, Replicated Docs

Timeline · 2 updates

monitoring May 19, 2026, 09:56 PM UTC

https://docs.replicated.com is currently impacted by an incident with an upstream provider. We are closely monitoring this issue and will provide updates here. Thank you for your patience in this matter.
resolved May 19, 2026, 10:44 PM UTC

We've seen consistent improvement to the loading and rendering of https://docs.replicated.com. Thank you for your patience through this issue.

Notice May 10, 2026

kurl.sh down

Detected by Pingoru: May 10, 2026, 07:10 AM UTC
Resolved: May 10, 2026, 08:19 AM UTC
Duration: 1h 9m

Affected: kURL.sh

Timeline · 6 updates

investigating May 10, 2026, 07:10 AM UTC

We are currently investigating this issue.
investigating May 10, 2026, 07:25 AM UTC

We are aware of an outage affecting https://kurl.sh, which prevents access to the documentation and to the website used to build installer YAML. Access to existing installer manifests is unaffected.
investigating May 10, 2026, 07:27 AM UTC

We are continuing to investigate this issue.
investigating May 10, 2026, 08:17 AM UTC

The upstream outage that was affecting this service has been resolved.
investigating May 10, 2026, 08:18 AM UTC

We are continuing to investigate this issue.
resolved May 10, 2026, 08:19 AM UTC

This incident has been resolved.

Major May 8, 2026

Issues with provisioning AKS in CMX

Detected by Pingoru: May 08, 2026, 10:21 PM UTC
Resolved: May 09, 2026, 12:34 AM UTC
Duration: 2h 12m

Affected: AKS

Timeline · 3 updates

investigating May 08, 2026, 10:21 PM UTC

We are currently investigating this issue.
monitoring May 08, 2026, 10:57 PM UTC

The issue appears to have been resolved by Azure, and we are continuing to monitor.
resolved May 09, 2026, 12:34 AM UTC

This incident has been resolved.

Minor May 5, 2026

Issues with provisioning kURL in CMX

Detected by Pingoru: May 05, 2026, 04:15 PM UTC
Resolved: May 06, 2026, 02:38 AM UTC
Duration: 10h 22m

Affected: kURL

Timeline · 6 updates

investigating May 05, 2026, 04:15 PM UTC

We are currently investigating this issue.
investigating May 05, 2026, 05:02 PM UTC

We are continuing to investigate this issue and will provide periodic updates on the incident.
investigating May 05, 2026, 06:04 PM UTC

We are continuing to investigate this issue with provisioning kURL on CMX. We appreciate your patience.
investigating May 05, 2026, 08:41 PM UTC

We are continuing to investigate this issue with provisioning kURL on CMX. We appreciate your patience.
identified May 05, 2026, 08:47 PM UTC

We have identified the issue impacting kURL on CMX and are working on a fix.
resolved May 06, 2026, 02:38 AM UTC

This incident has been resolved. A fix was released to production.

Notice April 24, 2026

CMX Azure AKS is degraded

Detected by Pingoru: Apr 24, 2026, 05:33 PM UTC
Resolved: Apr 25, 2026, 01:29 PM UTC
Duration: 19h 56m

Affected: AKS

Timeline · 2 updates

monitoring Apr 24, 2026, 05:33 PM UTC

CMX is impacted by an upstream incident for Azure AKS. We apologize for the inconvenience and are monitoring for improvements.
resolved Apr 25, 2026, 01:29 PM UTC

This incident has been resolved.

Minor February 25, 2026

Issues with Compatibility Matrix

Detected by Pingoru: Feb 25, 2026, 09:12 PM UTC
Resolved: Feb 25, 2026, 10:01 PM UTC
Duration: 49m

Affected: AKS, EKS, GKE, OpenShift, kURL, k3s, kind, Embedded Clusters, Core, OKE - Alpha, RKE2

Timeline · 5 updates

investigating Feb 25, 2026, 09:12 PM UTC

We are currently investigating this issue.
identified Feb 25, 2026, 09:19 PM UTC

We believe this relates to a networking issue with an upstream provider, which may be transient. We are currently running end-to-end tests to observe improvements internally.
monitoring Feb 25, 2026, 09:27 PM UTC

Internal end-to-end testing is showing successful operations with VMs and clusters on CMX.
monitoring Feb 25, 2026, 09:35 PM UTC

CMX operations appear to be working as expected. We are continuing to monitor this situation and the upstream provider.
resolved Feb 25, 2026, 10:01 PM UTC

We are seeing no signs of regression in networking performance or in operations of CMX VMs and clusters. This incident is resolved.

Major November 18, 2025

Replicated platform currently impacted by ongoing Cloudflare Major Incident

Detected by Pingoru: Nov 18, 2025, 11:57 AM UTC
Resolved: Nov 18, 2025, 03:38 PM UTC
Duration: 3h 40m

Affected: Cloudflare Tunnel, Vendor Portal, AKS, EKS, Cloudflare Workers, GKE, OpenShift, kURL, k3s, kind, Embedded Clusters, Core, OKE - Alpha, RKE2, Replicated Registry, Enterprise Portal, Replicated Docs

Timeline · 8 updates

identified Nov 18, 2025, 11:57 AM UTC

Replicated platform is currently impacted by ongoing Cloudflare Major Incident (https://www.cloudflarestatus.com/incidents/8gmgl950y3h7)
identified Nov 18, 2025, 11:59 AM UTC

We are continuing to work on a fix for this issue.
identified Nov 18, 2025, 12:02 PM UTC

We are continuing to work on a fix for this issue.
identified Nov 18, 2025, 12:07 PM UTC

We are continuing to work on a fix for this issue.
monitoring Nov 18, 2025, 02:52 PM UTC

We are seeing some recovery from the ongoing Cloudflare incident. We are continuing to monitor.
monitoring Nov 18, 2025, 02:56 PM UTC

We are continuing to monitor for any further issues.
monitoring Nov 18, 2025, 03:37 PM UTC

Internal testing and evaluation of the Replicated platform components shows they are operational. Cloudflare additionally notes that they believe the incident is resolved, though they are continuing to monitor. Should issues arise again, we will keep our customers informed of conditions.
resolved Nov 18, 2025, 03:38 PM UTC

This incident has been resolved.

Minor November 12, 2025

Possible delays in provisioning CMX resources

Detected by Pingoru: Nov 12, 2025, 04:00 PM UTC
Resolved: Nov 12, 2025, 04:09 PM UTC
Duration: 9m

Affected: AKS, EKS, GKE, OpenShift, kURL, k3s, kind, Embedded Clusters, Core, OKE - Alpha, RKE2

Timeline · 2 updates

investigating Nov 12, 2025, 04:00 PM UTC

We are currently investigating this issue.
resolved Nov 12, 2025, 04:09 PM UTC

We have observed a significant improvement in these resources following networking actions by an upstream provider. This incident is resolved at this time.

Minor October 30, 2025

Increased HTTP 5xx rate for registry.replicated.com

Detected by Pingoru: Oct 30, 2025, 07:20 PM UTC
Resolved: Oct 30, 2025, 10:37 PM UTC
Duration: 3h 16m

Affected: Replicated Registry

Timeline · 4 updates

investigating Oct 30, 2025, 07:20 PM UTC

We are currently investigating this issue.
identified Oct 30, 2025, 07:32 PM UTC

We have issued a rollback of a recent change to registry.replicated.com and are monitoring for improvement.
monitoring Oct 30, 2025, 09:03 PM UTC

We are observing healthy responses from registry.replicated.com at this time. We have identified that the error responses observed earlier related to unauthenticated pulls with specific versions of Helm. We are iterating on improvements to mitigate the cause of these problems before rolling forward again.
resolved Oct 30, 2025, 10:37 PM UTC

We have tested changes on the roll forward and responses from registry.replicated.com for Helm pulls continue to appear healthy. This incident is resolved.

Minor October 20, 2025

Reports that updates within Vendor Portal are unsuccessful

Detected by Pingoru: Oct 20, 2025, 03:22 PM UTC
Resolved: Oct 20, 2025, 06:13 PM UTC
Duration: 2h 50m

Affected: Vendor Portal

Timeline · 4 updates

investigating Oct 20, 2025, 03:22 PM UTC

We are currently investigating this issue.
identified Oct 20, 2025, 04:22 PM UTC

The issue has been identified and a fix is being deployed.
monitoring Oct 20, 2025, 04:50 PM UTC

A fix has been rolled out, and we are monitoring the results. Vendors may need to refresh the webpage.
resolved Oct 20, 2025, 06:13 PM UTC

This incident has been resolved.

Minor October 20, 2025

Compatibility Matrix clusters on EKS are impacted by ongoing incident with AWS

Detected by Pingoru: Oct 20, 2025, 02:07 PM UTC
Resolved: Oct 21, 2025, 12:42 AM UTC
Duration: 10h 35m

Affected: EKS

Timeline · 3 updates

monitoring Oct 20, 2025, 02:07 PM UTC

We have identified that provisioning new EKS clusters on CMX has a high failure rate. This appears to correlate with an ongoing incident impacting multiple services at AWS. We are monitoring this situation closely.
monitoring Oct 20, 2025, 10:23 PM UTC

We are beginning to see recovery for EKS on Compatibility Matrix. We will continue to monitor this and update as conditions change.
resolved Oct 21, 2025, 12:42 AM UTC

AWS states at this time that their incident affecting us-east-1 is now resolved. Through careful monitoring and testing, we observe that EKS on CMX is working once again.

Minor October 16, 2025

Increased rate of HTTP 500s reported from replicated.app

Detected by Pingoru: Oct 16, 2025, 09:28 PM UTC
Resolved: Oct 16, 2025, 09:55 PM UTC
Duration: 26m

Timeline · 4 updates

investigating Oct 16, 2025, 09:28 PM UTC

We are currently investigating the causes of this issue.
identified Oct 16, 2025, 09:31 PM UTC

We believe we have identified the source of the issue and are working to remediate it.
monitoring Oct 16, 2025, 09:36 PM UTC

We've issued a rollback for the affected platform component and are monitoring for full recovery.
resolved Oct 16, 2025, 09:55 PM UTC

Monitoring appears to show impact has been mitigated at this time.

Minor September 26, 2025

Slow performance for VM based cluster installations

Detected by Pingoru: Sep 26, 2025, 05:05 PM UTC
Resolved: Sep 26, 2025, 10:10 PM UTC
Duration: 5h 4m

Affected: OpenShift, kURL, k3s, kind, Embedded Clusters

Timeline · 4 updates

investigating Sep 26, 2025, 05:05 PM UTC

We're experiencing degraded network performance for Compatibility Matrix and are currently looking into it.
investigating Sep 26, 2025, 09:19 PM UTC

We are seeing network performance improvements now, but are still monitoring this issue.
monitoring Sep 26, 2025, 09:21 PM UTC

We are seeing network performance improvements now, but are still monitoring this issue.
resolved Sep 26, 2025, 10:10 PM UTC

This incident has been resolved.

Minor September 22, 2025

Performance degradation experienced for EU based workloads involving Replicated platform components.

Detected by Pingoru: Sep 22, 2025, 08:57 PM UTC
Resolved: Sep 23, 2025, 03:29 PM UTC
Duration: 18h 32m

Affected: OpenShift, kURL, k3s, kind, Embedded Clusters, RKE2, Replicated Registry

Timeline · 5 updates

investigating Sep 22, 2025, 08:57 PM UTC

We are currently investigating this issue.
monitoring Sep 22, 2025, 09:13 PM UTC

Customers and workloads in EU regions are experiencing significant network performance issues when communicating with the Replicated platform.
monitoring Sep 22, 2025, 09:16 PM UTC

We are continuing to monitor for any further issues.
monitoring Sep 22, 2025, 11:42 PM UTC

We are beginning to observe improved performance for downloads with workloads in the EU at this time. kURL and Embedded Cluster installations on CMX appear to be completing successfully.
resolved Sep 23, 2025, 03:29 PM UTC

We're continuing to observe consistent network performance and download success from EU regions communicating with Replicated platform components at this time.

Critical September 19, 2025

Increased GKE creation failure rates in Compatibility Matrix

Detected by Pingoru: Sep 19, 2025, 07:53 PM UTC
Resolved: Sep 19, 2025, 09:40 PM UTC
Duration: 1h 47m

Affected: GKE

Timeline · 2 updates

monitoring Sep 19, 2025, 07:53 PM UTC

Creating GKE clusters through Compatibility Matrix appears to be impacted by an ongoing service disruption with GCP for GCE VMs in us-central. We are closely monitoring this incident with Google.
resolved Sep 19, 2025, 09:40 PM UTC

Google Cloud states this service disruption has been resolved. We have validated this recovery in CMX and observe GKE clusters creating successfully at this time.

Minor September 18, 2025

Increased HTTP 5xx rate for api.replicated.com

Detected by Pingoru: Sep 18, 2025, 03:48 PM UTC
Resolved: Sep 18, 2025, 06:42 PM UTC
Duration: 2h 53m

Affected: Vendor Portal, Replicated Registry

Timeline · 3 updates

investigating Sep 18, 2025, 03:48 PM UTC

We are currently investigating this issue.
monitoring Sep 18, 2025, 05:02 PM UTC

After further review of the increase in HTTP 5xx errors, we've identified that these failed requests are generally coming through an upstream provider in a particular point of presence. We are monitoring ongoing issues with our upstream provider and will update here as conditions change.
resolved Sep 18, 2025, 06:42 PM UTC

Observed issues with an upstream provider have been resolved. The API and the Registry are behaving normally at this time.

Minor August 21, 2025

Multiple services are degraded

Detected by Pingoru: Aug 21, 2025, 06:36 PM UTC
Resolved: Aug 21, 2025, 08:57 PM UTC
Duration: 2h 20m

Affected: Vendor Portal, Replicated Registry

Timeline · 5 updates

investigating Aug 21, 2025, 06:36 PM UTC

We are investigating an issue where many services appear to have degraded performance or failed responses.
investigating Aug 21, 2025, 06:37 PM UTC

We are continuing to investigate this issue.
investigating Aug 21, 2025, 06:38 PM UTC

We suspect that this is related to an ongoing Cloudflare incident: https://www.cloudflarestatus.com/incidents/d9n3g1vnxdd2
monitoring Aug 21, 2025, 07:36 PM UTC

Services appear to be recovering. We will continue to monitor the issue.
resolved Aug 21, 2025, 08:57 PM UTC

The upstream incident has been resolved and we no longer see degradation.

Minor July 15, 2025

Compatibility Matrix clusters are taking longer than expected to start

Detected by Pingoru: Jul 15, 2025, 11:35 PM UTC
Resolved: Jul 15, 2025, 11:45 PM UTC
Duration: 9m

Affected: AKS, EKS, GKE, OpenShift, kURL, k3s, kind, Embedded Clusters, Core, OKE - Alpha, RKE2

Timeline · 3 updates

investigating Jul 15, 2025, 11:35 PM UTC

We have identified that creating new VMs and clusters takes 10 minutes or more. We are investigating the cause and will update this page when we have made progress.
monitoring Jul 15, 2025, 11:40 PM UTC

We have mitigated the issue and are monitoring the situation to ensure that provisioning is stable.
resolved Jul 15, 2025, 11:45 PM UTC

Service has been returned to normal.

Minor June 29, 2025

Some AKS instance types are unavailable

Detected by Pingoru: Jun 29, 2025, 09:47 PM UTC
Resolved: Jun 30, 2025, 02:29 AM UTC
Duration: 4h 41m

Affected: AKS

Timeline · 2 updates

identified Jun 29, 2025, 09:47 PM UTC

AKS instance types Standard_D2S_v5, Standard_D4S_v5, Standard_D8S_v5, Standard_D16S_v5, Standard_D32S_v5, and Standard_D48S_v5 are currently unavailable. We are working with Azure to correct this issue.
resolved Jun 30, 2025, 02:29 AM UTC

The incident has been resolved.

Minor June 12, 2025

High error rate from upstream cloud provider

Detected by Pingoru: Jun 12, 2025, 06:23 PM UTC
Resolved: Jun 12, 2025, 09:31 PM UTC
Duration: 3h 7m

Affected: GKE, Replicated Registry

Timeline · 5 updates

monitoring Jun 12, 2025, 06:23 PM UTC

We are aware of upstream issues with Google Cloud Platform. This is presently impacting GKE components of our Compatibility Matrix product, and upstream GCR registries with our Replicated Registry. We are monitoring our cloud providers for updates.
monitoring Jun 12, 2025, 07:24 PM UTC

We are continuing to monitor upstream issues with Google Cloud Platform and Cloudflare.
monitoring Jun 12, 2025, 09:29 PM UTC

We are continuing to monitor for any further issues.
monitoring Jun 12, 2025, 09:30 PM UTC

We are continuing to monitor for any further issues.
resolved Jun 12, 2025, 09:31 PM UTC

This incident has been resolved.

Minor May 29, 2025

CMX: Network Issues when creating RKE2 clusters

Detected by Pingoru: May 29, 2025, 05:55 PM UTC
Resolved: Jun 09, 2025, 07:18 PM UTC
Duration: 11d 1h

Affected: RKE2

Timeline · 2 updates

investigating May 29, 2025, 05:55 PM UTC

We're aware of network issues when creating an RKE2 cluster and are investigating.
resolved Jun 09, 2025, 07:18 PM UTC

This incident has been resolved.

Minor March 31, 2025

Compatibility Matrix: GKE 1.32 clusters failing to create

Detected by Pingoru: Mar 31, 2025, 10:31 PM UTC
Resolved: Apr 01, 2025, 03:09 AM UTC
Duration: 4h 37m

Affected: GKE

Timeline · 5 updates

investigating Mar 31, 2025, 10:31 PM UTC

We have noticed that GKE 1.32 clusters are not reaching a 'running' state. We are currently investigating this issue.
identified Mar 31, 2025, 11:39 PM UTC

We have identified an issue with pulling images from Google's registry, and are working with Google to find a resolution.
monitoring Apr 01, 2025, 02:43 AM UTC

We have rolled out a fix, pinning GKE 1.32 to version 1.32.2, and are monitoring the results now.
monitoring Apr 01, 2025, 02:44 AM UTC

We are continuing to monitor for any further issues.
resolved Apr 01, 2025, 03:09 AM UTC

This incident has been resolved.

Notice March 24, 2025

Email deliverability spam issue

Detected by Pingoru: Mar 24, 2025, 04:30 PM UTC
Resolved: Mar 25, 2025, 02:26 AM UTC
Duration: 9h 56m

Affected: Vendor Portal

Timeline · 3 updates

identified Mar 24, 2025, 04:30 PM UTC

Check your spam folder. Our transactional email provider is seeing deliverability problems with Gmail. https://status.postmarkapp.com/notices/bt3ky3r8zlaapqlo-increased-gmail-spam-reports
monitoring Mar 24, 2025, 11:27 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Mar 25, 2025, 02:26 AM UTC

This incident has been resolved.
