Camunda incident
Degraded system availability in Zeebe 8.5.4 (Camunda 8.5+gen6)
Camunda experienced a minor incident on July 3, 2024 affecting Zeebe, lasting 3h 53m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- identified Jul 03, 2024, 02:26 PM UTC
Our team has discovered a bug in Zeebe version 8.5.4 that can cause one broker to become unavailable on restart, degrading system performance. The bug is present in SaaS (generation 8.5+gen6) as well as in Self-Managed. The upgrade path to SaaS generation 8.5+gen6 has already been disabled. Other versions are not affected. Our development team is actively working on a resolution, and we will notify you as soon as patches containing the fix are available.
If you are running Self-Managed, please do not update to Zeebe version 8.5.4. If you are running SaaS and have not yet updated to 8.5+gen6, you are not affected. If you are running SaaS on 8.5+gen6, we will auto-update you to version 8.5+gen7 later today.
Affected versions: 8.5+gen6, Zeebe 8.5.4
- resolved Jul 03, 2024, 06:19 PM UTC
The issue has been resolved. The snapshot integrity check is now performed correctly and no longer produces false positives. This incident did not result in data loss or corruption: only the comparison was invalid, flagging snapshots as corrupted when they were not, which could prevent nodes from starting properly. Only users on 8.5+gen6 were possibly affected, and all affected clusters have been auto-updated. We apologize for any inconvenience caused and thank you for your patience.