Balena incident

Elevated API Errors

Major · Resolved

Balena experienced a major incident on March 3, 2025, affecting the API and lasting 5h 37m. The incident has been resolved; the full update timeline is below.

Started
Mar 03, 2025, 06:59 PM UTC
Resolved
Mar 04, 2025, 12:36 AM UTC
Duration
5h 37m
Detected by Pingoru
Mar 03, 2025, 06:59 PM UTC

Affected components

API

Update timeline

  1. investigating Mar 03, 2025, 06:59 PM UTC

    We're experiencing an elevated level of API errors and are currently looking into the issue.

  2. identified Mar 03, 2025, 07:48 PM UTC

    The issue has been identified and a fix is being implemented.

  3. identified Mar 03, 2025, 08:46 PM UTC

    The issue has been identified and a fix is being implemented.

  4. monitoring Mar 03, 2025, 09:03 PM UTC

    A fix has been implemented and we are monitoring the results.

  5. resolved Mar 04, 2025, 12:36 AM UTC

    This incident has been resolved.

  6. postmortem Mar 04, 2025, 12:37 AM UTC

    An internal observability feature gave API instances an unreasonably large base memory footprint under production load, which led to frequent evictions. For now, we’ve rolled back to a previous API version to restore stability while we investigate the root cause.