Ambra experienced a major incident on February 1, 2024, affecting Web Services, Image Processing, and 1 more component, and lasting 21 minutes. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Feb 01, 2024, 01:47 PM UTC
We have identified a major issue affecting Ambra. Our Engineering teams have been mobilized to address the issue, and we will provide frequent updates to keep you informed.
- identified Feb 01, 2024, 02:03 PM UTC
The Engineering teams have identified an issue with Redis memory use. They have implemented a temporary setting that will mitigate the issue until the root cause is fixed. We will continue to post status updates as we receive them.
- resolved Feb 01, 2024, 02:08 PM UTC
The incident has been fully resolved and service is back to normal levels. Our team will conduct a root cause analysis and share it as soon as possible. We will continue to monitor the situation to ensure there are no further issues.
- postmortem May 08, 2024, 08:51 PM UTC
An incorrectly configured webhook caused a very large number of sessions to be created, overloading Ambra's session tracking. Once the cause was determined, we disabled the incorrect webhooks and forcibly expired the invalid sessions. Upon further investigation, we determined that certain workflows can bypass Ambra's limits on the maximum number of sessions per user. R&D will investigate how to adjust these limits to prevent similar situations in the future.
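For readers who want a concrete picture of a per-user session limit, the following is a minimal illustrative sketch and not Ambra's implementation. The `SessionStore` class, its method names, and the default cap and TTL values are all hypothetical, and an in-memory dictionary stands in for the real session store. The idea it shows is that if the cap is enforced inside the single function that creates sessions, no calling workflow can skip the check.

```python
import time
from collections import OrderedDict


class SessionStore:
    """Hypothetical in-memory session tracker with a per-user cap."""

    def __init__(self, max_per_user=5, ttl_seconds=3600):
        self.max_per_user = max_per_user
        self.ttl_seconds = ttl_seconds
        # user_id -> OrderedDict(session_id -> expiry time), oldest first
        self._sessions = {}
        self._next_id = 0

    def create(self, user_id, now=None):
        """Create a session, evicting the oldest ones if the cap is reached."""
        now = time.time() if now is None else now
        sessions = self._sessions.setdefault(user_id, OrderedDict())

        # Drop sessions that have already expired.
        for sid in [s for s, expiry in sessions.items() if expiry <= now]:
            del sessions[sid]

        # Enforce the cap here, so every caller goes through the same check.
        while len(sessions) >= self.max_per_user:
            sessions.popitem(last=False)  # evict the oldest session

        self._next_id += 1
        sid = f"s{self._next_id}"
        sessions[sid] = now + self.ttl_seconds
        return sid

    def count(self, user_id):
        """Number of sessions currently tracked for this user."""
        return len(self._sessions.get(user_id, ()))

    def expire_all(self, user_id):
        """Forcibly expire every session for a user; return how many were removed."""
        return len(self._sessions.pop(user_id, ()))


# A misbehaving caller creating sessions in a loop stays bounded by the cap.
store = SessionStore(max_per_user=3)
for _ in range(1000):
    store.create("webhook-user", now=0.0)
bounded = store.count("webhook-user")
removed = store.expire_all("webhook-user")
```

In this sketch the oldest sessions are evicted when the cap is hit; rejecting the new session instead is an equally valid design choice, depending on whether existing or new sessions should take priority.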