Red Hat Outage History

Red Hat is up right now

Red Hat had 60 outages in the last 2 years totaling 219h 23m of downtime, an average of 2.5 incidents per month.

There have been 60 Red Hat outages since August 19, 2025 totaling 219h 23m of downtime. Each is summarised below with incident details, duration, and resolution information.
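The summary figures can be cross-checked with a little arithmetic. The sketch below assumes the "last 2 years" window means 24 months and uses only the numbers quoted above (60 outages, 219h 23m); the function and constant names are illustrative and do not come from the status page.

```python
# Figures quoted in the summary above.
OUTAGES = 60
WINDOW_MONTHS = 24                   # assumption: "the last 2 years" = 24 months
DOWNTIME_MINUTES = 219 * 60 + 23     # 219h 23m expressed in minutes


def per_month(outages: int, months: int) -> float:
    """Average number of incidents per month over the window."""
    return outages / months


def mean_duration(total_minutes: int, outages: int) -> tuple[int, int]:
    """Mean downtime per outage as (hours, minutes), rounded to the nearest minute."""
    mean = round(total_minutes / outages)
    return divmod(mean, 60)


print(per_month(OUTAGES, WINDOW_MONTHS))        # 2.5
hours, minutes = mean_duration(DOWNTIME_MINUTES, OUTAGES)
print(f"{hours}h {minutes}m")                   # 3h 39m
```

Under these assumptions the quoted rate of 2.5 incidents per month checks out, and the mean outage works out to roughly 3h 39m.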

Source: https://status.redhat.com

Major October 20, 2025

Cascading failures for ROSA and OSD services that depend on Quay and AWS

Detected by Pingoru
Oct 20, 2025, 08:45 AM UTC
Resolved
Oct 20, 2025, 10:29 PM UTC
Duration
13h 43m
Affected: OpenShift Cluster Manager
Timeline · 8 updates
  1. monitoring Oct 20, 2025, 08:45 AM UTC

    Due to an ongoing Quay and AWS incident, ROSA and OSD clusters may face degradation of on-cluster services as well as issues during installation. Red Hat is actively monitoring the situation and will provide updates as we become aware of them.

  2. monitoring Oct 20, 2025, 09:28 AM UTC

    The impact of the incident is currently limited to the AWS us-east-1 region.

  3. monitoring Oct 20, 2025, 11:18 AM UTC

    The AWS incident has been updated from Degraded to Impacting. We are still seeing impact to the ROSA and OSD services in the us-east-1 region, mostly related to EC2 instance launches. We are continuing to monitor the incident. Currently no customer actions are required. We will update the incident within 1 hour.

  4. monitoring Oct 20, 2025, 12:05 PM UTC

    According to the newest update from AWS, there are ongoing issues with VM launches. We're observing launch errors in the us-east-1 region which affect the ROSA and OSD products. The status hasn't changed, and no action is required from customers.

  5. monitoring Oct 20, 2025, 01:03 PM UTC

    According to the newest AWS update, mitigations of the EC2 instance launch issue are in progress. The incident is still ongoing. No customer action is required. Red Hat is monitoring the incident.

  6. monitoring Oct 20, 2025, 06:59 PM UTC

    Per AWS's most recent update at 11:22 AM PDT, multiple AWS services affecting compute and networking remain down or degraded. This may impact cluster operations, including cluster creation, image pulls, upgrades, and more. Please see https://health.aws.amazon.com/health/status for detailed and direct updates of these underlying services.

  7. monitoring Oct 20, 2025, 08:36 PM UTC

    Per AWS's most recent update at 1:03 PM PDT, multiple AWS services affecting compute and networking are continuing to see an improvement. However, there may still be impact to cluster operations, including cluster creation, image pulls, upgrades, and more. Please see https://health.aws.amazon.com/health/status for detailed and direct updates of these underlying services.

  8. resolved Oct 20, 2025, 10:29 PM UTC

    Per AWS's most recent update at 2:48 PM PDT, EC2 instance creation is no longer being throttled, and has returned to normal pre-incident levels. We have seen recovery of clusters affected by this outage, and are resolving this incident. If you are experiencing any issues, please reach out to Support.

Major October 20, 2025

Quay.io reporting errors - Container Registry is currently in read-only mode

Detected by Pingoru
Oct 20, 2025, 08:10 AM UTC
Resolved
Oct 20, 2025, 11:05 PM UTC
Duration
14h 54m
Affected: registry.redhat.io, Registry, Frontend
Timeline · 4 updates
  1. investigating Oct 20, 2025, 08:10 AM UTC

    Quay.io is currently reporting errors - service is currently in read-only mode

  2. investigating Oct 20, 2025, 12:02 PM UTC

    The team is currently monitoring the situation to make an informed decision on re-enabling write access to quay.io.

  3. identified Oct 20, 2025, 10:52 PM UTC

    Per AWS's most recent update, the majority of services have been restored. We have begun the process of re-enabling Pushes.

  4. resolved Oct 20, 2025, 11:05 PM UTC

    All services have been restored.

Major October 2, 2025

Connect.redhat.com - Outage

Detected by Pingoru
Oct 02, 2025, 04:57 PM UTC
Resolved
Oct 02, 2025, 04:59 PM UTC
Duration
1m
Affected: Partner Container Certification Workflow, Partner Acceleration Desk, Partner Account Management, Partner Portal, Partner Subscriptions
Timeline · 2 updates
  1. identified Oct 02, 2025, 04:57 PM UTC

    connect.redhat.com is currently unavailable due to an underlying service issue. Our team is actively working to resolve the problem. We appreciate your patience and will provide updates as they become available.

  2. resolved Oct 02, 2025, 04:59 PM UTC

    This incident has been resolved.

Major September 25, 2025

Quay.io UI Reporting Internal Error

Detected by Pingoru
Sep 25, 2025, 01:20 PM UTC
Resolved
Sep 25, 2025, 08:10 PM UTC
Duration
6h 49m
Affected: Frontend, Security Scanning
Timeline · 4 updates
  1. identified Sep 25, 2025, 01:20 PM UTC

    The quay.io web UI is currently reporting errors. Image push/pull and builds continue to work. We have identified the issue and are rolling out a fix.

  2. identified Sep 25, 2025, 01:27 PM UTC

    We are continuing to work on a fix for this issue.

  3. monitoring Sep 25, 2025, 01:38 PM UTC

    A fix has been implemented; however, we will need to keep Security Scanning disabled for the time being. We will re-enable it as soon as possible.

  4. resolved Sep 25, 2025, 08:10 PM UTC

    The incident has been resolved and Security Scanning is now fully enabled.

Minor September 24, 2025

Red Hat 3scale API Management SaaS Admin and Developer Portal UIs are currently experiencing a service disruption

Detected by Pingoru
Sep 24, 2025, 01:36 PM UTC
Resolved
Sep 24, 2025, 02:01 PM UTC
Duration
25m
Timeline · 2 updates
  1. investigating Sep 24, 2025, 01:36 PM UTC

    The Red Hat 3scale Site Reliability Engineering team is conducting a root cause analysis and taking corrective actions. Updates and additional information will be provided as soon as they become available, and efforts will continue until full resolution is achieved.

  2. resolved Sep 24, 2025, 02:01 PM UTC

    This incident has been resolved.

Minor September 19, 2025

Red Hat 3scale API Management SaaS APIs are currently experiencing a service disruption

Detected by Pingoru
Sep 19, 2025, 09:06 AM UTC
Resolved
Sep 19, 2025, 10:10 AM UTC
Duration
1h 3m
Affected: API Manager
Timeline · 3 updates
  1. investigating Sep 19, 2025, 09:06 AM UTC

    The Red Hat 3scale Site Reliability Engineering team is conducting a root cause analysis and taking corrective actions. Updates and additional information will be provided as soon as they become available, and efforts will continue until full resolution is achieved.

  2. monitoring Sep 19, 2025, 09:20 AM UTC

    Sustained period of increased service latency.

  3. resolved Sep 19, 2025, 10:10 AM UTC

    This incident has been resolved.

Major September 9, 2025

Hybrid Cloud Console partial frontend outage

Detected by Pingoru
Sep 09, 2025, 07:40 PM UTC
Resolved
Sep 09, 2025, 09:20 PM UTC
Duration
1h 39m
Affected: console.redhat.com
Timeline · 2 updates
  1. identified Sep 09, 2025, 07:40 PM UTC

    We are aware of a few Hybrid Cloud Console applications which are unexpectedly serving 403 errors. We've identified that they are not correctly requesting the necessary OAuth scopes and are investigating a fix.

  2. resolved Sep 09, 2025, 09:20 PM UTC

    We have fixed the issues contributing to incorrect OAuth scopes being requested for some applications. The Hybrid Cloud Console is back to normal operation.

Critical September 2, 2025

Cloud-Connector Outage

Detected by Pingoru
Sep 02, 2025, 01:36 PM UTC
Resolved
Sep 02, 2025, 03:32 PM UTC
Duration
1h 55m
Affected: Red Hat Lightspeed - Advisor, Red Hat Lightspeed - Remediations
Timeline · 3 updates
  1. investigating Sep 02, 2025, 01:36 PM UTC

    Cloud-Connector is experiencing an outage due to DNS issues. This may impact the ability to perform remote Playbook runs.

  2. investigating Sep 02, 2025, 02:51 PM UTC

    Remediations & Advisor/Tasks services impacted. New RHC connections may fail, affecting the ability to run playbooks and tasks.

  3. resolved Sep 02, 2025, 03:32 PM UTC

    Akamai & HarperDB have addressed the configuration issues that led to this outage. End to end tests confirm this has been resolved.

Critical August 29, 2025

OCM API degraded

Detected by Pingoru
Aug 29, 2025, 12:26 PM UTC
Resolved
Aug 29, 2025, 02:32 PM UTC
Duration
2h 5m
Affected: api.openshift.com
Timeline · 3 updates
  1. investigating Aug 29, 2025, 12:26 PM UTC

    OCM upgrade API returning 500 responses.

  2. monitoring Aug 29, 2025, 02:19 PM UTC

    Latency and error rate have decreased. Transitioning to monitoring for sustained stability.

  3. resolved Aug 29, 2025, 02:32 PM UTC

    Sustained stability observed.
