Balena Outage History

Balena had 14 outages in the last 2 years, totaling 612h 37m of downtime and averaging 0.6 incidents per month.

Each of the 14 Balena outages recorded since June 5, 2025 is summarised below with incident details, duration, and resolution information.
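The per-incident durations and the monthly average can be reproduced from the timestamps listed on this page. The following is a minimal sketch, not the tracker's actual code: the function names are hypothetical, the timestamp format is inferred from the entries below, and the 24-month window is an assumption based on the "last 2 years" wording.

```python
from datetime import datetime

# Format inferred from entries such as "May 05, 2026, 06:12 PM UTC" (assumption).
TIMESTAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_timestamp(text):
    """Parse a timestamp string as shown in the incident entries."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


def format_duration(detected, resolved):
    """Return a duration like '2h 51m' or '21d 3h'.

    Returns None when the resolved time is not after the detected time,
    which is the case for entries that list no duration.
    """
    delta = parse_timestamp(resolved) - parse_timestamp(detected)
    total_minutes = int(delta.total_seconds() // 60)
    if total_minutes <= 0:
        return None
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    if days:
        return f"{days}d {hours}h"
    if hours:
        return f"{hours}h {minutes}m"
    return f"{minutes}m"


def incidents_per_month(incident_count, months):
    """Average incidents per month, rounded to one decimal place."""
    return round(incident_count / months, 1)
```

For example, `format_duration("May 05, 2026, 06:12 PM UTC", "May 05, 2026, 09:03 PM UTC")` gives `"2h 51m"`, and `incidents_per_month(14, 24)` gives `0.6`. Because the entries only show minutes, a duration computed this way may differ from the listed value by a minute.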

Source: https://status.balena.io

Major May 5, 2026

Elevated API Errors

Detected by Pingoru: May 05, 2026, 06:12 PM UTC
Resolved: May 05, 2026, 09:03 PM UTC
Duration: 2h 51m

Affected: API, Application Builder, Delta Image Downloads

Timeline · 5 updates

investigating May 05, 2026, 06:12 PM UTC

We're experiencing an elevated level of API errors and are currently looking into the issue.
identified May 05, 2026, 06:19 PM UTC

The issue has been identified and a fix is being implemented.
monitoring May 05, 2026, 06:43 PM UTC

A fix has been implemented and we are monitoring the results.
resolved May 05, 2026, 09:03 PM UTC

This incident has been resolved.
postmortem May 05, 2026, 09:18 PM UTC

A vulnerability mitigation update required replacing compute cluster nodes; when applied, the update rolled back due to a timeout. This caused some workloads (API, delta, builder) to become temporarily unavailable and triggered some undesired secondary effects, including the ungraceful termination of a few long-lived instances servicing VPN connections. While the rest of the services came back reasonably quickly, within a minute or so, it took around one and a quarter hours to re-establish VPN tunnels. A scheduled maintenance will be posted later to perform this update during a planned outage window.


Minor March 31, 2026

Elevated GIT/Application Builder Errors

Detected by Pingoru: Mar 31, 2026, 12:55 PM UTC
Resolved: Apr 21, 2026, 04:30 PM UTC
Duration: 21d 3h

Affected: Application Builder

Timeline · 5 updates

identified Mar 31, 2026, 12:55 PM UTC

We're experiencing an elevated level of errors in our application builder infrastructure and are currently looking into the issue.
monitoring Apr 08, 2026, 07:51 PM UTC

A fix has been implemented and we are monitoring the results.
monitoring Apr 08, 2026, 07:51 PM UTC

We are continuing to monitor for any further issues.
resolved Apr 21, 2026, 04:30 PM UTC

This incident has been resolved.
postmortem Apr 21, 2026, 05:06 PM UTC

Starting around March 11, some cloud builds began failing intermittently with "no such image" errors. The failures were non-deterministic and affected all architectures. At peak, some users saw around 50% failure rates. We identified and fixed several bugs in the builder's image garbage collector that caused it to over-count freed disk space and run too aggressively, eventually deleting images that in-progress builds still needed. Fixes were deployed between March 19 and April 14, with build failure rates dropping to near-zero after the final deploy. We're continuing to monitor and working on additional safeguards to prevent the garbage collector from targeting images that active builds depend on.
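The safeguard described in this postmortem, keeping the garbage collector away from images that active builds depend on, can be illustrated with a short sketch. This is not balena's builder code: the `Image` type, the `select_images_to_delete` function, and the least-recently-used ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """Hypothetical record for a cached build image."""
    image_id: str
    size_bytes: int
    last_used: float  # e.g. a Unix timestamp


def select_images_to_delete(images, in_use_ids, bytes_needed):
    """Pick least-recently-used images until enough space would be freed.

    Images referenced by in-progress builds (in_use_ids) are never selected,
    and each selected image's size is counted exactly once.
    """
    selected = []
    freed = 0
    for image in sorted(images, key=lambda item: item.last_used):
        if freed >= bytes_needed:
            break
        if image.image_id in in_use_ids:
            continue  # an active build still needs this image
        selected.append(image)
        freed += image.size_bytes
    return selected, freed
```

Returning the freed byte count alongside the selection lets a caller compare it with what the filesystem reports afterwards, which is one way an over-count could be noticed.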


Minor March 23, 2026

Builder Degraded performance

Detected by Pingoru: Mar 23, 2026, 04:57 PM UTC
Resolved: Mar 25, 2026, 01:18 PM UTC
Duration: 1d 20h

Affected: Application Builder

Timeline · 4 updates

investigating Mar 23, 2026, 04:57 PM UTC

We are seeing several builds intermittently failing with 404 "No such image" errors and are investigating.
monitoring Mar 23, 2026, 06:55 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Mar 25, 2026, 01:18 PM UTC

This incident has been resolved.
postmortem Mar 25, 2026, 03:38 PM UTC

Between March 11 and March 25, some cloud builds experienced intermittent failures with "no such image" errors. The issue was non-deterministic and did not affect all builds. We've identified a likely contributing factor and deployed mitigations that have stabilized build reliability. We're continuing to investigate the underlying cause to prevent recurrence. If you experienced build failures during this window, re-running your build should succeed. We appreciate your patience while we worked through this, and we apologize for the disruption.


Minor March 3, 2026

Elevated Dashboard Errors

Detected by Pingoru: Mar 03, 2026, 05:44 PM UTC
Resolved: Mar 03, 2026, 08:17 PM UTC
Duration: 2h 33m

Affected: Dashboard

Timeline · 4 updates

investigating Mar 03, 2026, 05:44 PM UTC

We're experiencing an issue where the Dashboard may redirect to an unexpected page on initial load, which can prevent access to certain account and billing pages.
monitoring Mar 03, 2026, 07:45 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Mar 03, 2026, 08:17 PM UTC

This incident has been resolved.
postmortem Mar 09, 2026, 01:01 PM UTC

We identified an issue in Dashboard v32.2.0, released on March 2, 2026, where opening the dashboard via a direct link to certain pages (such as billing or other account management pages) could result in being unexpectedly redirected to the fleets overview. This was caused by a race condition in our access control logic that made a routing decision before all authorization data had finished loading. The issue was resolved on March 3, 2026 with a fix that ensures the dashboard waits for all access information to be available before determining whether a user can view a page. We understand this was frustrating, particularly for users trying to manage billing or account settings via bookmarked or shared links. We apologize for the disruption and are adding test coverage for direct-link navigation to prevent similar regressions in the future.
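The general shape of the fix described above, deferring the routing decision until every piece of authorization data has loaded, can be sketched as follows. This is an illustration in Python's asyncio rather than the dashboard's own code; `fetch_permissions`, `resolve_route`, the permission strings, and the page names are all hypothetical.

```python
import asyncio


async def fetch_permissions(delay_seconds, permissions):
    """Stand-in for a network call that loads part of a user's permissions."""
    await asyncio.sleep(delay_seconds)
    return set(permissions)


async def resolve_route(requested_page, required_permission,
                        permission_sources, fallback_page="fleets"):
    """Decide which page to show only after all permission sources resolve.

    asyncio.gather waits for every source, so a slow response cannot be
    mistaken for a missing permission.
    """
    results = await asyncio.gather(*permission_sources)
    granted = set().union(*results)
    if required_permission in granted:
        return requested_page
    return fallback_page


async def demo():
    sources = [
        fetch_permissions(0.01, {"fleets:read"}),
        fetch_permissions(0.05, {"billing:read"}),  # slower source
    ]
    return await resolve_route("billing", "billing:read", sources)
```

Running `asyncio.run(demo())` returns `"billing"`. A version that made its decision as soon as the first source returned would fall back to `"fleets"`, which mirrors the unexpected redirect described in the postmortem.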


Minor February 26, 2026

Elevated Device SSH Errors

Detected by Pingoru: Feb 26, 2026, 09:13 PM UTC
Resolved: Feb 27, 2026, 03:39 AM UTC
Duration: 6h 25m

Affected: SSH proxy

Timeline · 6 updates


Major February 10, 2026

Elevated Delta Errors

Detected by Pingoru: Feb 10, 2026, 09:33 AM UTC
Resolved: Feb 11, 2026, 12:02 AM UTC
Duration: 14h 28m

Affected: Delta Image Downloads

Timeline · 4 updates

investigating Feb 10, 2026, 09:33 AM UTC

Some delta generation requests are encountering errors and failing. We are currently investigating this issue.
monitoring Feb 10, 2026, 10:25 AM UTC

We have identified the potential cause and have rolled back the changes.
resolved Feb 11, 2026, 12:02 AM UTC

This incident has been resolved.
postmortem Feb 11, 2026, 12:10 AM UTC

v2 delta generation service experienced failures from ~21:15 UTC Feb 9 to ~10:00 UTC Feb 10, 2026, due to a missing configuration dependency during a logic change.

**Impact:**
* v2 delta generation requests failed to complete
* No data loss or security impact

**Root Cause:** Recent logic changes were deployed without the required accompanying configuration update, preventing the service from completing v2 delta requests.

**Resolution:** The logic changes were rolled back, restoring the service to its previous stable state.

**Follow-up Actions:**
* Prepare and deploy the permanent fix

We apologize for the disruption and any inconvenience this caused. We are committed to improving our processes to prevent similar issues in the future.


Critical February 10, 2026

Elevated Cloudlink Errors

Detected by Pingoru: Feb 10, 2026, 04:05 AM UTC
Resolved: Feb 10, 2026, 07:56 AM UTC
Duration: 3h 51m

Affected: Cloudlink (VPN)

Timeline · 5 updates

investigating Feb 10, 2026, 04:05 AM UTC

We're experiencing an elevated level of errors in our Cloudlink infrastructure and are currently looking into the issue.
identified Feb 10, 2026, 06:39 AM UTC

The issue has been identified and a fix is being implemented.
monitoring Feb 10, 2026, 07:17 AM UTC

A fix has been implemented and we are monitoring the results.
resolved Feb 10, 2026, 07:56 AM UTC

This incident has been resolved.
postmortem Feb 10, 2026, 11:52 AM UTC

Balena devices were unable to connect to Cloudlink on February 10, 2026, from approximately 02:26 GMT to 07:11 GMT due to an expired server certificate. Devices that were already connected to Cloudlink were unaffected unless the connection was terminated.

**Root Cause:** The Cloudlink servers were using an expired certificate that was due for replacement. Consequently, incoming Cloudlink connections failed with a certificate verification error.

**Resolution:** The certificate has been replaced, and Cloudlink servers were restarted to use the new certificate. Balena devices are expected to reconnect to Cloudlink within a few minutes after being disconnected due to the restart.

**Follow-up Actions:**
* Expand certificate expiry monitoring coverage to include all active certificates
* Automate the certificate renewal process for Cloudlink

We apologize for any disruption this caused and appreciate your patience as we continue improving our processes and operations.
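Certificate expiry monitoring of the kind listed in the follow-up actions can be approximated with Python's standard library. The sketch below is a generic illustration, not balena's tooling: the function names and the 30-day threshold are assumptions, and it presumes the certificate is served from a TLS endpoint reachable by host and port.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Days remaining before a certificate's notAfter time.

    not_after uses the format returned by getpeercert(),
    e.g. "Jan  1 00:00:00 2030 GMT".
    """
    now = now or datetime.now(timezone.utc)
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now.timestamp()) / 86400


def needs_renewal(not_after, threshold_days=30, now=None):
    """True when the certificate expires within threshold_days (assumed policy)."""
    return days_until_expiry(not_after, now) < threshold_days


def fetch_not_after(host, port=443, timeout=5.0):
    """Read the notAfter field of the certificate served at host:port.

    With the default context, an already expired certificate raises
    ssl.SSLCertVerificationError, which is itself a signal worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `needs_renewal(fetch_not_after(host))` for each monitored endpoint and raise an alert well before the expiry date.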


Major January 28, 2026

Partial API Outage

Detected by Pingoru: Jan 28, 2026, 01:40 AM UTC
Resolved: Jan 28, 2026, 01:00 AM UTC
Duration: —

Timeline · 2 updates


Minor December 23, 2025

Degraded Performance

Detected by Pingoru: Dec 23, 2025, 04:32 PM UTC
Resolved: Dec 24, 2025, 09:38 AM UTC
Duration: 17h 6m

Affected: API, Application Builder, balenahub

Timeline · 3 updates

investigating Dec 23, 2025, 04:32 PM UTC

We are currently investigating an issue affecting the availability of balenaCloud services.
monitoring Dec 23, 2025, 07:03 PM UTC

Scaling issues during service deployment were caused by the unavailability of nodes from the underlying scaling provider.
resolved Dec 24, 2025, 09:38 AM UTC

Insufficient AWS compute capacity overloaded the remaining nodes. This high load caused readiness probes to fail, triggering API restarts that created a feedback loop of increasing pressure.


Critical December 5, 2025

An upstream provider outage is affecting connectivity to balenaCloud services

Detected by Pingoru: Dec 05, 2025, 09:05 AM UTC
Resolved: Dec 05, 2025, 09:37 AM UTC
Duration: 31m

Affected: API, Dashboard, Website

Timeline · 3 updates

identified Dec 05, 2025, 09:05 AM UTC

Cloudflare, our proxy provider, is having service issues. Connectivity to balenaCloud services is currently affected.
monitoring Dec 05, 2025, 09:16 AM UTC

Our upstream provider has implemented some fixes. balenaCloud services are back online. We are still monitoring the situation.
resolved Dec 05, 2025, 09:37 AM UTC

This incident has been resolved.


Minor November 12, 2025

Builder partially degraded service

Detected by Pingoru: Nov 12, 2025, 09:13 AM UTC
Resolved: Nov 12, 2025, 07:58 PM UTC
Duration: 10h 45m

Affected: Application Builder

Timeline · 5 updates


Notice October 1, 2025

Elevated Device URLs/VPN Errors

Detected by Pingoru: Oct 01, 2025, 02:51 PM UTC
Resolved: Oct 01, 2025, 01:30 PM UTC
Duration: —

Timeline · 1 update

resolved Oct 01, 2025, 02:51 PM UTC

An update to our kube-system infrastructure resulted in a disruptive pod rollout that left some devices disconnected from Cloudlink for up to 10 minutes. We apologize for the interruption; future updates to this component will be handled within planned maintenance windows.


Major September 30, 2025

Elevated Device VPN Tunnel Errors

Detected by Pingoru: Sep 30, 2025, 07:47 PM UTC
Resolved: Sep 30, 2025, 09:56 PM UTC
Duration: 2h 8m

Affected: Cloudlink (VPN)

Timeline · 5 updates


Notice July 7, 2025

Degraded API performance

Detected by Pingoru: Jul 07, 2025, 10:40 AM UTC
Resolved: Jul 07, 2025, 10:00 AM UTC
Duration: —

Timeline · 1 update

resolved Jul 07, 2025, 10:40 AM UTC

Temporarily degraded performance of API response processing.
