Amazee Outage History

Amazee degraded · 2 active incidents

Amazee had 39 outages in the last 2 years (since September 10, 2024), totaling 3609h 23m of downtime and averaging 1.6 incidents per month. Each is summarised below with incident details, duration, and resolution information.
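
The summary figures can be checked with a short calculation. The sketch below is only illustrative: it assumes the two-year window is exactly 24 months, and it uses the three longest durations listed further down (75d 5h, 49d 1h, 13d 4h) to show how much of the total comes from long-running notices rather than short service interruptions.

```python
# Figures copied from the summary line.
incidents = 39
downtime_hours = 3609 + 23 / 60  # 3609h 23m
months = 24                      # assumption: "last 2 years" taken as 24 months

per_month = incidents / months
mean_hours = downtime_hours / incidents

# Three long-running notices from the list below, converted to hours.
long_running = {
    "Public statement on CVE-2025-55182": 75 * 24 + 5,
    "Changed processing of HTTP 425 responses": 49 * 24 + 1,
    "CVE-2026-31431 / Copy Fail": 13 * 24 + 4,
}
long_total = sum(long_running.values())
share = long_total / downtime_hours

print(f"{per_month:.1f} incidents per month")
print(f"{mean_hours:.1f} h mean duration per incident")
print(f"{long_total} h ({share:.0%}) from three long-running notices")
```

Read this way, the total is dominated by a few advisory-style entries that stayed open for weeks, so it should not be taken as hours of service unavailability.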

Source: https://status.amazee.io

Notice June 1, 2026

Delayed log ingestion

Detected by Pingoru: Jun 01, 2026, 02:07 PM UTC
Resolved: Jun 01, 2026, 05:33 PM UTC
Duration: 3h 25m

Affected: Lagoon Logs (Kibana)

Timeline · 2 updates

monitoring Jun 01, 2026, 02:07 PM UTC

There is a delay in log processing, and recent logs for events generated since June 1 may not be available. Real-time logs can be accessed through the Lagoon CLI: https://docs.amazee.io/cloud/logging/#real-time-container-logs-via-lagoon-cli
resolved Jun 01, 2026, 05:33 PM UTC

This incident has been resolved.
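
The "Duration" values on this page can be recomputed from the "Detected" and "Resolved" timestamps. The sketch below is a minimal helper, not the method used by the page: the format string and the function names are assumptions chosen to match the timestamps as printed here. Because the printed timestamps only have minute resolution, a recomputed value can differ from the listed one by a minute (this incident recomputes to 3h 26m against a listed 3h 25m).

```python
from datetime import datetime, timezone

# Matches timestamps as printed on this page, e.g. "Jun 01, 2026, 02:07 PM UTC".
FMT = "%b %d, %Y, %I:%M %p UTC"


def parse_utc(stamp: str) -> datetime:
    """Parse a page timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(stamp, FMT).replace(tzinfo=timezone.utc)


def format_duration(start: str, end: str) -> str:
    """Format the elapsed time in the page's style ("53m", "3h 6m", "13d 4h")."""
    total_minutes = int((parse_utc(end) - parse_utc(start)).total_seconds() // 60)
    days, rem = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(rem, 60)
    if days:
        # Multi-day entries on the page drop the minutes.
        return f"{days}d {hours}h" if hours else f"{days}d"
    if hours:
        return f"{hours}h {minutes}m"
    return f"{minutes}m"


print(format_duration("Jun 01, 2026, 02:07 PM UTC", "Jun 01, 2026, 05:33 PM UTC"))
```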

Minor May 7, 2026

Degraded: cert-manager unavailable - TLS certificate issuance affected

Detected by Pingoru: May 07, 2026, 06:08 PM UTC
Resolved: May 07, 2026, 09:14 PM UTC
Duration: 3h 6m

Affected: jp1.lagoon, ca1.lagoon, ie1.lagoon, Deployment Infrastructure, de3.lagoon, fi2.lagoon, au2.lagoon, us2.lagoon, uk3.lagoon, us3.lagoon, AMER, APAC, EMEA, ch4.lagoon, amazee.ai infrastructure: au103, amazee.ai infrastructure: de-eu101, amazee.ai infrastructure: de103, amazee.ai infrastructure: uk103, amazee.ai infrastructure: us103, amazee.ai infrastructure: us104, amazee.ai infrastructure: ch103, amazee.ai infrastructure

Timeline · 3 updates

identified May 07, 2026, 06:08 PM UTC

We are currently experiencing an issue with cert-manager in affected clusters. The cert-manager controller pod is failing to start because its image cannot be pulled from quay.io, an external registry that is returning 502 Bad Gateway. Impact: New TLS certificates cannot be issued or renewed while cert-manager is unavailable. Existing certificates remain valid until their expiry. Services relying on automatic certificate provisioning may be affected if certificates expire during this window. Upstream status: https://status.redhat.com/
monitoring May 07, 2026, 06:24 PM UTC

All clusters are able to pull the cert-manager image again. We are continuing to monitor the upstream status page: https://status.redhat.com/
resolved May 07, 2026, 09:14 PM UTC

This incident has been resolved.
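
The impact statement for this incident hinges on whether a certificate expires while issuance is unavailable. The sketch below shows one way to check how long a site's current certificate has left, using only the Python standard library. The function names are hypothetical, and any hostname passed to `fetch_not_after` would be your own site, not something named on this page.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443) -> str:
    """Return the notAfter field of the certificate served by host:port."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before notAfter (negative if the certificate has expired).

    `not_after` uses the format returned by getpeercert(),
    for example "May  9 00:00:00 2026 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400
```

For example, `days_until_expiry(fetch_not_after("www.example.com"))` returns the remaining lifetime in days; a value smaller than the expected length of an issuance outage would indicate a certificate at risk.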

Minor April 30, 2026

CVE-2026-31431 / Copy Fail

Detected by Pingoru: Apr 30, 2026, 01:33 AM UTC
Resolved: May 13, 2026, 06:01 AM UTC
Duration: 13d 4h

Affected: jp1.lagoon, ca1.lagoon, ie1.lagoon, de3.lagoon, fi2.lagoon, au2.lagoon, us2.lagoon, uk3.lagoon, us3.lagoon, AMER, APAC, EMEA, ch4.lagoon

Timeline · 2 updates

identified Apr 30, 2026, 01:33 AM UTC

We are aware of CVE-2026-31431 / "Copy Fail" and have completed an initial assessment. We have not identified evidence of customer impact, service compromise, or active exploitation at this stage. We are prioritizing Linux kernel upgrades and will provide further updates if our assessment changes or if maintenance becomes necessary.
resolved May 13, 2026, 06:01 AM UTC

Mitigations at the Linux kernel level have been rolled out across all regions.

Minor April 29, 2026

High error rate on Fastly Management API

Detected by Pingoru: Apr 29, 2026, 09:19 AM UTC
Resolved: Apr 29, 2026, 12:51 PM UTC
Duration: 3h 31m

Affected: CDN

Timeline · 4 updates

investigating Apr 29, 2026, 09:19 AM UTC

We are currently investigating a high error rate on the Fastly management API. This impacts deployments of new domains that are routed through Fastly. Existing setups on Fastly or traffic routing are not impacted.
investigating Apr 29, 2026, 09:58 AM UTC

Fastly is investigating the API issues and updating their status page at: https://www.fastlystatus.com/incident/378511 Impact: Deployments with new domains that are routed through Fastly will continue to work, but the actual domain setup in Fastly will be delayed.
monitoring Apr 29, 2026, 11:15 AM UTC

Fastly resolved the issue with their management API and we'll keep monitoring the situation.
resolved Apr 29, 2026, 12:51 PM UTC

This incident has been resolved.

Notice April 21, 2026

Degraded Performance

Detected by Pingoru: Apr 21, 2026, 12:09 PM UTC
Resolved: Apr 21, 2026, 01:03 PM UTC
Duration: 53m

Affected: de3.lagoon, uk3.lagoon

Timeline · 2 updates

investigating Apr 21, 2026, 12:09 PM UTC

We are currently investigating this issue.
resolved Apr 21, 2026, 01:03 PM UTC

This incident has been resolved.

Notice April 15, 2026

Post-maintenance Rollout

Detected by Pingoru: Apr 15, 2026, 04:19 AM UTC
Resolved: Apr 15, 2026, 07:24 AM UTC
Duration: 3h 5m

Affected: de3.lagoon

Timeline · 3 updates

identified Apr 15, 2026, 04:19 AM UTC

There was an issue with an EKS rollout during maintenance. We are actively addressing it.
monitoring Apr 15, 2026, 06:16 AM UTC

A fix has been implemented and we are monitoring the results.
resolved Apr 15, 2026, 07:24 AM UTC

This incident has been resolved.

Notice April 7, 2026

Increased Build Failures

Detected by Pingoru: Apr 07, 2026, 04:18 PM UTC
Resolved: Apr 08, 2026, 09:54 PM UTC
Duration: 1d 5h

Affected: Deployment Infrastructure

Timeline · 3 updates

identified Apr 07, 2026, 04:18 PM UTC

We have identified an issue causing Lagoon builds to fail and are actively working to resolve it.
monitoring Apr 07, 2026, 05:27 PM UTC

We have applied a fix for this issue and are monitoring deployments.
resolved Apr 08, 2026, 09:54 PM UTC

The incident has been resolved.

Minor March 18, 2026

Increased Lagoon API error rate

Detected by Pingoru: Mar 18, 2026, 01:41 PM UTC
Resolved: Mar 19, 2026, 02:52 PM UTC
Duration: 1d 1h

Affected: Lagoon API, Lagoon Dashboard

Timeline · 3 updates

investigating Mar 18, 2026, 01:41 PM UTC

We are investigating increased error rates on the Lagoon API.
identified Mar 19, 2026, 03:27 AM UTC

The issue has been identified and a fix is being implemented.
resolved Mar 19, 2026, 02:52 PM UTC

This incident has been resolved.

Critical March 17, 2026

Lagoon API slow or degraded

Detected by Pingoru: Mar 17, 2026, 06:42 AM UTC
Resolved: Mar 17, 2026, 08:48 PM UTC
Duration: 14h 5m

Affected: Lagoon API

Timeline · 4 updates

investigating Mar 17, 2026, 06:42 AM UTC

We are currently investigating this issue.
identified Mar 17, 2026, 07:45 AM UTC

The issue has been identified and a fix is being implemented.
monitoring Mar 17, 2026, 07:45 AM UTC

A fix has been implemented and we are monitoring the results.
resolved Mar 17, 2026, 08:48 PM UTC

This incident has been resolved.

Notice February 18, 2026

Changed processing of HTTP 425 responses in Google Chrome

Detected by Pingoru: Feb 18, 2026, 07:01 AM UTC
Resolved: Apr 08, 2026, 08:10 AM UTC
Duration: 49d 1h

Affected: CDN

Timeline · 2 updates

monitoring Feb 18, 2026, 07:01 AM UTC

In the latest release of Google Chrome (v145) the handling of HTTP 425 responses was changed. Should you see this affecting your sites, please reach out to our support and we'll look into applying mitigations for your specific use-case. Related status page from Fastly: https://www.fastlystatus.com/incident/378300#
resolved Apr 08, 2026, 08:10 AM UTC

This incident has been resolved.

Minor February 9, 2026

Elevated Errors on Fastly level

Detected by Pingoru: Feb 09, 2026, 07:38 PM UTC
Resolved: Feb 10, 2026, 07:52 AM UTC
Duration: 12h 13m

Affected: CDN

Timeline · 2 updates

monitoring Feb 09, 2026, 07:38 PM UTC

Fastly has reported elevated errors affecting Fastly Applications. Existing certificates do not appear to be impacted; however, some services are experiencing issues when adding new domains. We are monitoring the Fastly status page: https://www.fastlystatus.com/
resolved Feb 10, 2026, 07:52 AM UTC

This incident has been resolved.

Notice January 30, 2026

Intermittent SSH access issues

Detected by Pingoru: Jan 30, 2026, 02:59 PM UTC
Resolved: Feb 02, 2026, 02:29 PM UTC
Duration: 2d 23h

Affected: Lagoon API

Timeline · 3 updates

investigating Jan 30, 2026, 02:59 PM UTC

We're investigating intermittent SSH access issues on cloud clusters.
monitoring Jan 30, 2026, 03:58 PM UTC

We have identified and mitigated the cause for the SSH access issues, and are currently monitoring the portal.
resolved Feb 02, 2026, 02:29 PM UTC

This incident has been resolved.

Notice January 28, 2026

HTTP Connection Errors

Detected by Pingoru: Jan 28, 2026, 08:25 PM UTC
Resolved: Jan 29, 2026, 06:45 AM UTC
Duration: 10h 20m

Affected: us2.lagoon

Timeline · 2 updates

monitoring Jan 28, 2026, 08:25 PM UTC

We detected HTTP connection errors. The underlying issue has been resolved, and we are actively monitoring the status.
resolved Jan 29, 2026, 06:45 AM UTC

This incident has been resolved.

Minor January 12, 2026

Increase in failed deployments

Detected by Pingoru: Jan 12, 2026, 11:05 AM UTC
Resolved: Jan 13, 2026, 07:08 AM UTC
Duration: 20h 2m

Affected: Deployment Infrastructure

Timeline · 3 updates

investigating Jan 12, 2026, 11:05 AM UTC

We have seen an increase in failed deployments and are currently investigating the cause.
identified Jan 12, 2026, 11:33 AM UTC

The issue has been identified and a fix is being implemented.
resolved Jan 13, 2026, 07:08 AM UTC

This incident has been resolved.

Notice December 5, 2025

Public statement on CVE-2025-55182

Detected by Pingoru: Dec 05, 2025, 01:08 AM UTC
Resolved: Feb 18, 2026, 06:59 AM UTC
Duration: 75d 5h

Timeline · 4 updates

Major December 1, 2025

Logging partially unavailable

Detected by Pingoru: Dec 01, 2025, 07:06 AM UTC
Resolved: Dec 03, 2025, 06:22 AM UTC
Duration: 1d 23h

Affected: Lagoon Logs (Kibana)

Timeline · 3 updates

identified Dec 01, 2025, 07:06 AM UTC

The logging infrastructure is rebalancing because a new month has started, which is causing some requests to fail.
monitoring Dec 02, 2025, 08:19 AM UTC

There is a backlog of logs to be ingested into the system. While this is happening in the background, some recent logs might not be visible in the logging system right away.
resolved Dec 03, 2025, 06:22 AM UTC

The backlog has been processed and the logging system is fully available again.

Major November 16, 2025

CH4: Workloads unable to start

Detected by Pingoru: Nov 16, 2025, 05:32 PM UTC
Resolved: Nov 16, 2025, 11:21 PM UTC
Duration: 5h 49m

Affected: ch4.lagoon

Timeline · 4 updates

investigating Nov 16, 2025, 05:32 PM UTC

New workloads on CH4 are unable to start. Deployments are expected to fail currently.
investigating Nov 16, 2025, 06:54 PM UTC

We have escalated the issue to the infrastructure provider and are working with them to resolve the issue.
monitoring Nov 16, 2025, 09:16 PM UTC

We applied a fix that restored the ability to start workloads and run deployments.
resolved Nov 16, 2025, 11:21 PM UTC

This incident has been resolved.

Major November 14, 2025

Logging inaccessible

Detected by Pingoru: Nov 14, 2025, 07:56 AM UTC
Resolved: Nov 14, 2025, 11:40 AM UTC
Duration: 3h 44m

Affected: Lagoon Logs (Kibana)

Timeline · 3 updates

investigating Nov 14, 2025, 07:56 AM UTC

We are currently investigating this issue.
identified Nov 14, 2025, 08:18 AM UTC

There was an unexpected rise in load on the logging system. We are now stabilizing the different components.
resolved Nov 14, 2025, 11:40 AM UTC

This incident has been resolved.

Minor November 3, 2025

Connection interruption for CH4 database

Detected by Pingoru: Nov 03, 2025, 11:27 AM UTC
Resolved: Nov 03, 2025, 02:11 PM UTC
Duration: 2h 44m

Affected: ch4.lagoon

Timeline · 3 updates

investigating Nov 03, 2025, 11:27 AM UTC

We have received multiple reports of connection interruptions when using the production databases on CH4. We are investigating the cause of these interruptions.
monitoring Nov 03, 2025, 11:44 AM UTC

Since these issues are related to elevated resource usage, we are doing an emergency resource increase. We expect a short interruption for existing connections, which will help to prevent further disruptions.
resolved Nov 03, 2025, 02:11 PM UTC

This incident has been resolved.

Major November 3, 2025

Logs access degraded

Detected by Pingoru: Nov 03, 2025, 06:24 AM UTC
Resolved: Nov 04, 2025, 06:46 AM UTC
Duration: 1d

Affected: Lagoon Logs (Kibana)

Timeline · 3 updates

investigating Nov 03, 2025, 06:24 AM UTC

We have identified a problem with logs ingestion and access. Engineers are investigating. We will provide an update within the next 90 minutes.
investigating Nov 03, 2025, 07:52 AM UTC

The logging system is working through a backlog of ingested logs that need to be allocated and indexed. More data will become available over time as the backlog is processed. Because indexing is a resource-intensive task, some requests may occasionally result in an error.
resolved Nov 04, 2025, 06:46 AM UTC

Due to the large backlog of logs, some log messages from 2025-11-01 to 2025-11-03 could not be loaded into the new logging system and had to be dropped. However, these logs are still available in the old logging system at https://kibana-legacy.amazeeio.cloud.

Minor October 20, 2025

Limited workload scaling and deployments

Detected by Pingoru: Oct 20, 2025, 09:34 AM UTC
Resolved: Oct 20, 2025, 02:02 PM UTC
Duration: 4h 27m

Affected: jp1.lagoon, ca1.lagoon, ie1.lagoon, Deployment Infrastructure, de3.lagoon, fi2.lagoon, au2.lagoon, us2.lagoon, uk3.lagoon, us3.lagoon, AMER, APAC, EMEA, ch4.lagoon

Timeline · 3 updates

monitoring Oct 20, 2025, 09:34 AM UTC

Due to an upstream issue within AWS, some workloads could not be scaled or deployed. The issue also affects our support system, which is only partially available.
monitoring Oct 20, 2025, 11:17 AM UTC

Some workloads in us2.lagoon are blocked from starting up due to the upstream issue. Other regions are no longer affected by this.
resolved Oct 20, 2025, 02:02 PM UTC

We are no longer seeing workloads affected by this issue.

Critical September 25, 2025

Lagoon API Outage

Detected by Pingoru: Sep 25, 2025, 12:28 AM UTC
Resolved: Sep 25, 2025, 01:34 AM UTC
Duration: 1h 6m

Affected: Lagoon API, Lagoon Dashboard

Timeline · 3 updates

investigating Sep 25, 2025, 12:28 AM UTC

The Lagoon UI and API are currently experiencing an issue. We're investigating.
monitoring Sep 25, 2025, 01:05 AM UTC

We experienced an issue with image pulls during maintenance that resulted in a prolonged outage. This has now been resolved, and we will continue to monitor.
resolved Sep 25, 2025, 01:34 AM UTC

This incident has been resolved.

Major September 22, 2025

Unresponsive workloads on UK3

Detected by Pingoru: Sep 22, 2025, 11:50 AM UTC
Resolved: Sep 22, 2025, 03:05 PM UTC
Duration: 3h 15m

Affected: uk3.lagoon

Timeline · 3 updates

investigating Sep 22, 2025, 11:50 AM UTC

Some workloads on UK3 are not responding as expected due to high load across a part of the compute infrastructure.
monitoring Sep 22, 2025, 12:01 PM UTC

The situation has been stabilized, and we continue to monitor the system closely.
resolved Sep 22, 2025, 03:05 PM UTC

This incident has been resolved.

Major July 16, 2025

Delayed log processing

Detected by Pingoru: Jul 16, 2025, 02:55 PM UTC
Resolved: Jul 17, 2025, 08:52 AM UTC
Duration: 17h 56m

Affected: Lagoon API, Lagoon Logs (Kibana)

Timeline · 5 updates

investigating Jul 16, 2025, 02:55 PM UTC

There is a delay in log processing, and recent logs are not available in the logging system. Real-time logs can be accessed through the Lagoon CLI: https://docs.amazee.io/cloud/logging/#real-time-container-logs-via-lagoon-cli
identified Jul 16, 2025, 04:02 PM UTC

We are rolling out a potential solution which can cause connectivity issues with the Lagoon API.
monitoring Jul 16, 2025, 04:14 PM UTC

While the logs are being processed the logging system can be unavailable at times.
monitoring Jul 17, 2025, 04:59 AM UTC

The processing of some older logs is still ongoing.
resolved Jul 17, 2025, 08:52 AM UTC

The historical logs have been processed and are available in the logging stack.

Notice July 8, 2025

DNS issues

Detected by Pingoru: Jul 08, 2025, 11:31 AM UTC
Resolved: Jul 08, 2025, 01:41 PM UTC
Duration: 2h 10m

Affected: Deployment Infrastructure, Nameservers

Timeline · 3 updates

identified Jul 08, 2025, 11:31 AM UTC

The issue has been identified and a fix is being implemented.
monitoring Jul 08, 2025, 12:24 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Jul 08, 2025, 01:41 PM UTC

While DNS-01 challenges were being rolled out in our infrastructure, the image registry was unavailable due to a temporary DNS misconfiguration.
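
For context on the term used in this resolution note: a DNS-01 challenge (RFC 8555, section 8.4) asks the ACME client to prove control of a domain by publishing a TXT record under `_acme-challenge.<domain>`. The sketch below only illustrates how that record is derived. It is not a description of amazee.io's setup; the function name is hypothetical, and the domain and key authorization are placeholders.

```python
import base64
import hashlib
from typing import Tuple


def dns01_txt_record(domain: str, key_authorization: str) -> Tuple[str, str]:
    """Return the (record name, TXT value) pair for an ACME DNS-01 challenge.

    The value is the base64url-encoded SHA-256 digest of the key
    authorization, with trailing "=" padding removed.
    """
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value


# Placeholder inputs: a real key authorization is "<token>.<account key thumbprint>".
name, value = dns01_txt_record("example.com", "placeholder-token.placeholder-thumbprint")
print(name, value)
```

Because validation depends on DNS records being created and resolved correctly, changes made while enabling it touch DNS configuration directly, which is consistent with the misconfiguration described above.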
