Northflank Outage History

Northflank had 19 outages in the last 2 years (since July 16, 2024), totaling 1h 26m of downtime and averaging 0.8 incidents per month. Each is summarised below with incident details, duration, and resolution information.
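The page does not say how the monthly average is derived. As a minimal sketch, assuming it is simply the incident count divided by a 24-month window (the function name `average_per_month` is hypothetical), the arithmetic works out as follows:

```python
def average_per_month(incident_count: int, window_months: int) -> float:
    """Average number of incidents per month over a fixed window.

    Assumption: the "last 2 years" window is treated as exactly 24 months.
    """
    if window_months <= 0:
        raise ValueError("window_months must be positive")
    return incident_count / window_months


# 19 incidents over an assumed 24-month window
print(round(average_per_month(19, 24), 1))  # 0.8
```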

Source: https://status.northflank.com

Minor May 13, 2026

Europe - West (London): New resources not able to be created

Detected by Pingoru: May 13, 2026, 03:00 PM UTC
Resolved: May 13, 2026, 04:27 PM UTC
Duration: 1h 26m

Affected: Northflank Platform (Services), Northflank Platform (Addons), Northflank Platform (Jobs)

Timeline · 3 updates

investigating May 13, 2026, 03:00 PM UTC

We are currently investigating this incident. Existing running workloads are unaffected.

monitoring May 13, 2026, 03:33 PM UTC

We are seeing an improvement; workloads are starting to be created. Connectivity has been restored, and we are continuing to monitor the situation.

resolved May 13, 2026, 04:27 PM UTC

This incident has been resolved.


Minor May 10, 2026

Active DDoS Attack in Europe - West - Netherlands

Detected by Pingoru: May 10, 2026, 06:30 PM UTC
Resolved: May 10, 2026, 06:30 PM UTC
Duration: —

Timeline · 1 update

resolved May 10, 2026, 06:30 PM UTC

Type: Incident
Duration: 2 hours and 2 minutes
Affected Components: Addons, Services

May 10, 18:30:00 GMT+0 - Identified - We are currently experiencing a large DDoS attack in the Europe - West - Netherlands region. This is affecting public ingress into the cluster. Internal traffic is not affected. The team is working on additional mitigations.
May 10, 20:10:07 GMT+0 - Monitoring - We have put in place mitigations and are monitoring the affected region.
May 10, 20:31:56 GMT+0 - Resolved - This incident has been resolved.


Minor May 8, 2026

Let's Encrypt Certificate Generation Issues

Detected by Pingoru: May 08, 2026, 07:00 PM UTC
Resolved: May 08, 2026, 07:00 PM UTC
Duration: —

Timeline · 1 update

resolved May 08, 2026, 07:00 PM UTC

Type: Incident
Duration: 2 hours and 36 minutes
Affected Components: Certificates

May 8, 19:00:00 GMT+0 - Identified - [Let's Encrypt is currently experiencing a partial service disruption](https://letsencrypt.status.io/) leading to certificate generation delays and failures on the Northflank platform. This will impact provisioning of new addons, domains and BYOC clusters.
May 8, 21:36:09 GMT+0 - Resolved - Certificate generation is functioning normally again. We will continue to monitor the situation.


Minor April 8, 2026

Addon and volume availability issues in US - Central

Detected by Pingoru: Apr 08, 2026, 06:15 AM UTC
Resolved: Apr 08, 2026, 06:15 AM UTC
Duration: —

Timeline · 1 update

resolved Apr 08, 2026, 06:15 AM UTC

Type: Incident
Duration: 52 minutes
Affected Components: Addons, Services

Apr 8, 06:15:00 GMT+0 - Identified - We are aware of an issue with a node affecting some stateful workloads in US - Central. We are working on a fix for this incident.
Apr 8, 07:07:27 GMT+0 - Resolved - We have recovered the node, and all workloads are now running again.


Minor March 16, 2026

App UI is experiencing loading issues

Detected by Pingoru: Mar 16, 2026, 05:42 PM UTC
Resolved: Mar 16, 2026, 05:42 PM UTC
Duration: —

Timeline · 1 update

resolved Mar 16, 2026, 05:42 PM UTC

Type: Incident
Duration: 34 minutes
Affected Components: Northflank App

Mar 16, 17:42:54 GMT+0 - Identified - We are investigating an incident where the UI is failing to load correctly.
Mar 16, 18:15:58 GMT+0 - Monitoring - We identified the root cause as a system component causing disproportionate database load. This is affecting read performance, leading to degraded performance for the app UI. We have implemented a fix and are monitoring the situation.
Mar 16, 18:16:28 GMT+0 - Resolved - This incident has been resolved.


Minor March 9, 2026

Degraded performance in US Central region

Detected by Pingoru: Mar 09, 2026, 12:06 PM UTC
Resolved: Mar 09, 2026, 12:06 PM UTC
Duration: —

Timeline · 1 update

resolved Mar 09, 2026, 12:06 PM UTC

Type: Incident
Duration: 29 minutes
Affected Components: Networking, Addons, Services

Mar 9, 12:06:32 GMT+0 - Investigating - We are currently investigating this incident.
Mar 9, 12:35:56 GMT+0 - Resolved - The root cause has been identified as an issue with the GCP infrastructure control plane and has been resolved. There was minimal impact to running user workloads. The main impact was a delay in provisioning net new or redeploying existing workloads.


Minor February 19, 2026

Issues with builds using the Heroku 24 Buildpack

Detected by Pingoru: Feb 19, 2026, 03:00 PM UTC
Resolved: Feb 19, 2026, 03:00 PM UTC
Duration: —

Timeline · 1 update

resolved Feb 19, 2026, 03:00 PM UTC

Type: Incident
Duration: 1 hour and 34 minutes
Affected Components: Builds

Feb 19, 15:00:00 GMT+0 - Identified - We are aware of an issue where builds fail to start when using the Heroku 24 buildpack image. We are currently working on identifying a solution. We will provide an update when the fix has been released.
Feb 19, 16:34:09 GMT+0 - Resolved - We have released a fix for the issue.


Minor December 11, 2025

Node infrastructure stability - London Region

Detected by Pingoru: Dec 11, 2025, 01:54 PM UTC
Resolved: Dec 11, 2025, 01:54 PM UTC
Duration: —

Timeline · 1 update

resolved Dec 11, 2025, 01:54 PM UTC

Type: Incident
Duration: 39 minutes
Affected Components: Networking, Addons, Jobs, Services, Northflank Platform

Dec 11, 13:54:32 GMT+0 - Identified - There are currently issues with node stability in the London region leading to partial outages for some workloads. The team has identified the issue and is implementing a mitigation.
Dec 11, 14:13:28 GMT+0 - Monitoring - The mitigation is in place and the region is operating as expected.
Dec 11, 14:33:03 GMT+0 - Resolved - The incident has been resolved and the cause identified. A node image security release led to the host filesystem going into read-only mode when interacting with a specific set of workloads, causing the node to become unresponsive. We have implemented a mitigation and are working on a permanent solution.


Minor September 24, 2025

Docker Hub is experiencing issues

Detected by Pingoru: Sep 24, 2025, 11:53 PM UTC
Resolved: Sep 24, 2025, 11:53 PM UTC
Duration: —

Timeline · 1 update

resolved Sep 24, 2025, 11:53 PM UTC

Type: Incident
Duration: 1 hour and 44 minutes
Affected Components: Addons, Jobs, Services

Sep 24, 23:53:50 GMT+0 - Identified - Docker Hub is currently experiencing issues related to authentication. This may cause issues with starting jobs, addons and services.
Sep 25, 01:11:08 GMT+0 - Monitoring - We are observing that image pull requests for Docker Hub images are now succeeding.
Sep 25, 01:38:09 GMT+0 - Resolved - Docker Hub is fully operational and image pulling is now working as expected.


Minor September 16, 2025

Delay in DNS propagation

Detected by Pingoru: Sep 16, 2025, 03:09 PM UTC
Resolved: Sep 16, 2025, 03:09 PM UTC
Duration: —

Timeline · 1 update

resolved Sep 16, 2025, 03:09 PM UTC

Type: Incident
Duration: 18 hours and 29 minutes
Affected Components: Addons, Services, Certificates

Sep 16, 15:09:51 GMT+0 - Monitoring - Our DNS provider NS1 is experiencing delays in DNS propagation. This is affecting service and addon creation. NS1 status page:
Sep 17, 09:39:01 GMT+0 - Resolved - The incident has been resolved on NS1 and we have not been seeing any more related failures over the last hour.


Minor August 25, 2025

Issue affecting local disk caching for builds running on Northflank's infrastructure

Detected by Pingoru: Aug 25, 2025, 09:48 PM UTC
Resolved: Aug 25, 2025, 09:48 PM UTC
Duration: —

Timeline · 1 update

resolved Aug 25, 2025, 09:48 PM UTC

Type: Incident
Duration: 1 hour and 17 minutes
Affected Components: Builds

Aug 25, 21:48:19 GMT+0 - Identified - There is an issue with the underlying storage which provides the local disk-based caching feature for builds on Northflank's infrastructure. BYOC builds and builds without a local cache will be unaffected. We are currently working on a fix for this.
Aug 25, 22:45:24 GMT+0 - Monitoring - We implemented a fix and are currently monitoring the result.
Aug 25, 23:04:54 GMT+0 - Resolved - This incident has been resolved.


Major August 5, 2025

Log & Metric Ingestion and Query Outage

Detected by Pingoru: Aug 05, 2025, 07:23 AM UTC
Resolved: Aug 05, 2025, 07:23 AM UTC
Duration: —

Timeline · 1 update

resolved Aug 05, 2025, 07:23 AM UTC

Type: Incident
Duration: 1 hour
Affected Components: Logs and Metrics

Aug 5, 07:23:59 GMT+0 - Investigating - We are currently investigating this incident.
Aug 5, 07:55:18 GMT+0 - Monitoring - We implemented a fix and are currently monitoring the result.
Aug 5, 08:23:56 GMT+0 - Resolved - This incident has been resolved.


Minor June 12, 2025

Issues with Google Cloud platform - Affecting multiple services

Detected by Pingoru: Jun 12, 2025, 06:00 PM UTC
Resolved: Jun 12, 2025, 06:00 PM UTC
Duration: —

Timeline · 1 update


Major June 11, 2025

Metrics ingestion and query outage

Detected by Pingoru: Jun 11, 2025, 10:00 PM UTC
Resolved: Jun 11, 2025, 10:00 PM UTC
Duration: —

Timeline · 1 update

resolved Jun 11, 2025, 10:00 PM UTC

Type: Incident
Duration: 2 hours and 17 minutes
Affected Components: Logs and Metrics

Jun 11, 22:00:00 GMT+0 - Identified - Metrics query and ingestion are currently affected. Logs are unaffected. We are currently applying a fix for the issue and monitoring the results.
Jun 11, 23:16:34 GMT+0 - Monitoring - The fix has been rolled out and we are currently monitoring the ingestion rate and catch-up.
Jun 12, 00:16:31 GMT+0 - Resolved - The ingestion rate has been stable for 30 minutes and historic writes have caught up. We sincerely apologise for the service disruption you may have experienced. We are investigating the root cause to prevent similar issues in the future.


Minor May 21, 2025

Infrastructure stability issues

Detected by Pingoru: May 21, 2025, 04:04 PM UTC
Resolved: May 21, 2025, 04:04 PM UTC
Duration: —

Timeline · 1 update


Minor November 8, 2024

US - Central - Inability to Create Disk Snapshots

Detected by Pingoru: Nov 08, 2024, 06:02 PM UTC
Resolved: Nov 08, 2024, 06:02 PM UTC
Duration: —

Timeline · 1 update

resolved Nov 08, 2024, 06:02 PM UTC

Type: Incident
Duration: 1 hour and 17 minutes
Affected Components: Addons

Nov 8, 18:02:57 GMT+0 - Monitoring - Google Cloud is experiencing issues with snapshot upload in the US Central region. This is causing delays for addon snapshot backups within the region.
Nov 8, 19:20:22 GMT+0 - Resolved - This incident has been resolved.


Minor October 11, 2024

DNS Propagation Delay

Detected by Pingoru: Oct 11, 2024, 06:41 PM UTC
Resolved: Oct 11, 2024, 06:41 PM UTC
Duration: —

Timeline · 1 update

resolved Oct 11, 2024, 06:41 PM UTC

Type: Incident
Duration: 1 hour and 27 minutes
Affected Components: Services, Addons, Certificates

Oct 11, 18:41:37 GMT+0 - Monitoring - Our DNS provider is experiencing issues with DNS propagation leading to delays in provisioning new workloads and certificates. Existing workloads are unaffected.
Oct 11, 20:09:04 GMT+0 - Resolved - This incident has been resolved.


Major July 16, 2024

Metrics ingestion partial outage

Detected by Pingoru: Jul 16, 2024, 10:46 AM UTC
Resolved: Jul 16, 2024, 10:46 AM UTC
Duration: —

Timeline · 1 update

resolved Jul 16, 2024, 10:46 AM UTC

Type: Incident
Affected Components: Logs and Metrics

Jul 16, 10:46:00 GMT+0 - Resolved - Metrics experienced ingestion issues between 10:46 AM and 11:30 AM UTC. We implemented a fix and are currently monitoring the result.


Major July 16, 2024

Partial Logs Outage

Detected by Pingoru: Jul 16, 2024, 04:00 AM UTC
Resolved: Jul 16, 2024, 04:00 AM UTC
Duration: —

Timeline · 1 update

resolved Jul 16, 2024, 04:00 AM UTC

Type: Incident
Affected Components: Logs and Metrics

Jul 16, 04:00:00 GMT+0 - Resolved - Engineers identified an incident with Northflank's logging backend between 04:00 UTC and 07:00 UTC. The problem has been resolved. The impact is log loss during this timeframe. We are still investigating the cause of the issue and whether logs can be recovered, and will keep you updated.
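Most incident records above give their length as free text in the "Duration:" field (for example "2 hours and 36 minutes"), while the page summary uses a compact form such as "1h 26m". The sketch below shows one possible way to convert between the two and to sum several durations. The helper names `parse_duration` and `format_short` are hypothetical and not part of any Northflank or Pingoru tooling, and it assumes durations only ever mention hours and minutes.

```python
import re
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse text such as '2 hours and 36 minutes', '52 minutes' or '1 hour'.

    Assumption: only hour and minute units appear, as in the records above.
    """
    hours = re.search(r"(\d+)\s+hours?", text)
    minutes = re.search(r"(\d+)\s+minutes?", text)
    if hours is None and minutes is None:
        raise ValueError(f"no duration found in {text!r}")
    return timedelta(
        hours=int(hours.group(1)) if hours else 0,
        minutes=int(minutes.group(1)) if minutes else 0,
    )


def format_short(duration: timedelta) -> str:
    """Render a timedelta in the compact '1h 26m' style used in the summary."""
    total_minutes = int(duration.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m" if hours else f"{minutes}m"


# Example with three duration strings taken from the records above
samples = ["2 hours and 2 minutes", "52 minutes", "1 hour"]
total = sum((parse_duration(s) for s in samples), timedelta())
print(format_short(total))  # 3h 54m
```

Records without a "Duration:" field (such as the two July 16, 2024 incidents) would raise `ValueError` here, so a caller would need to decide whether to skip them or derive a length from the timestamps in the update text.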
