Balena experienced a minor incident on June 5, 2025 affecting Application Builder, lasting 3h 20m. The incident has been resolved; the full update timeline is below.
Affected components
- Application Builder
Update timeline
- investigating Jun 05, 2025, 02:11 PM UTC
We're experiencing an elevated level of errors in our application builder infrastructure and are currently looking into the issue.
- identified Jun 05, 2025, 02:12 PM UTC
The issue has been identified and a fix is being implemented.
- monitoring Jun 05, 2025, 02:30 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Jun 05, 2025, 04:20 PM UTC
This incident has been resolved.
- postmortem Jun 05, 2025, 04:55 PM UTC
A [known issue](https://github.com/apocas/docker-modem/issues/168) in an upstream component causes an error and a crash when too many layers are present on our cloud builders and we attempt to list their metadata for garbage collection. We have encountered this before, and in the past we lowered the threshold for garbage collection to avoid significant layer buildup. However, this does not address the root cause, and it is still possible to exceed the metadata limit when many smaller layers accumulate in the local engine cache. To avoid a repeat of this incident, we will revisit the upstream issue and open a pull request with a fix ourselves.
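The gap described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the garbage collection threshold is based on cached bytes and that the metadata listing grows with the number of layers; the postmortem does not state either detail. The names `Layer`, `listing_bytes`, `should_collect`, and all numbers are hypothetical and are not taken from the builder's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Layer:
    size_bytes: int      # disk space used by the layer
    metadata_bytes: int  # hypothetical size of this layer's entry in a listing


def listing_bytes(layers: List[Layer]) -> int:
    """Approximate size of a response that lists metadata for every layer."""
    return sum(layer.metadata_bytes for layer in layers)


def should_collect(
    layers: List[Layer],
    max_cache_bytes: int,
    max_layer_count: Optional[int] = None,
) -> bool:
    """Decide whether to run garbage collection.

    The byte threshold alone models the assumed existing behaviour.
    The optional layer-count threshold is an illustrative extra guard.
    """
    cache_bytes = sum(layer.size_bytes for layer in layers)
    if cache_bytes >= max_cache_bytes:
        return True
    return max_layer_count is not None and len(layers) >= max_layer_count


# Hypothetical caches: a few large layers versus many small ones.
MAX_CACHE_BYTES = 15 * 1024**3  # assumed 15 GiB threshold
few_large = [Layer(2 * 1024**3, 2_000) for _ in range(10)]
many_small = [Layer(10 * 1024, 2_000) for _ in range(200_000)]

# The byte threshold catches the large layers but not the small ones,
# even though the small-layer listing is far bigger.
large_triggers = should_collect(few_large, MAX_CACHE_BYTES)
small_triggers = should_collect(many_small, MAX_CACHE_BYTES)
small_triggers_with_count = should_collect(
    many_small, MAX_CACHE_BYTES, max_layer_count=100_000
)
```

In this sketch the cache of many small layers stays well under the byte threshold while its metadata listing is several orders of magnitude larger than that of the large-layer cache. A count-based guard narrows the gap but, like lowering the threshold, only works around the problem; the fix still belongs in the upstream component.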