Balena incident

Elevated Application Registry Errors

Minor Resolved

Balena experienced a minor incident on July 25, 2024 affecting Application Builder, Application Registry, and Delta Image Downloads, lasting 5d 20h. The incident has been resolved; the full update timeline is below.

Started
Jul 25, 2024, 04:57 PM UTC
Resolved
Jul 31, 2024, 01:25 PM UTC
Duration
5d 20h
Detected by Pingoru
Jul 25, 2024, 04:57 PM UTC

Affected components

Application Builder, Application Registry, Delta Image Downloads

Update timeline

  1. investigating Jul 25, 2024, 04:57 PM UTC

    We're experiencing degraded performance when creating new releases on our builders and are currently looking into the issue. Users experiencing issues creating releases with the `balena push` command are advised to temporarily switch to local builds via `balena deploy`, if possible, until the incident is resolved.

  2. investigating Jul 25, 2024, 08:31 PM UTC

    We are continuing to investigate this issue.

  3. investigating Jul 25, 2024, 11:59 PM UTC

    We are continuing to investigate this issue.

  4. investigating Jul 26, 2024, 12:13 PM UTC

    We are continuing to investigate this issue.

  5. monitoring Jul 26, 2024, 03:13 PM UTC

    A fix has been implemented and we are monitoring the results.

  6. resolved Jul 31, 2024, 01:25 PM UTC

    This incident has been resolved.

  7. postmortem Jul 31, 2024, 01:25 PM UTC

    We experienced timeouts for deltas and builds when pushing images to our registry hosted in the US East (N. Virginia) region. This issue impacted our cloud builders in Finland and Germany, among other regions. The root cause was identified as a public routing issue between certain regions, affecting the ability of some of our systems to access the registry efficiently. We resolved the issue by enabling proxied routing protocols for our registry endpoint. This allowed us to bypass the impacted network paths and restore normal operations.

    ## Impact

    * Cloud builders in Finland and Germany experienced delays in image pushing
    * Potential delays in deployment pipelines for affected regions
    * No data loss or security breaches occurred
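The workaround suggested in the first update (try `balena push`, and switch to a local build with `balena deploy` if the cloud build fails) could be scripted roughly as follows. This is a minimal sketch, not an official Balena tool: the `release` and `run_cli` helpers and the fleet name `myFleet` are hypothetical, and it assumes the balena CLI is installed and logged in and that a local Docker daemon is available for `balena deploy --build`.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command (list of arguments) and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run_cli(cmd: Sequence[str]) -> int:
    """Run a command in a subprocess and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def release(fleet: str, source: str = ".", runner: Runner = run_cli) -> str:
    """Create a release, preferring the cloud builders.

    Returns "push" if the cloud build succeeded, or "deploy" if the
    local-build fallback succeeded. Raises RuntimeError if both fail.
    """
    # First attempt: build on the cloud builders.
    if runner(["balena", "push", fleet, "--source", source]) == 0:
        return "push"
    # Fallback: build locally with Docker, then deploy the result.
    if runner(["balena", "deploy", fleet, "--build", "--source", source]) == 0:
        return "deploy"
    raise RuntimeError("both 'balena push' and 'balena deploy' failed")
```

Calling `release("myFleet")` from a project directory would attempt the cloud build first. The `runner` parameter exists only so the fallback logic can be exercised without invoking the real CLI.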