Ambra incident

Ambra Platform Unavailable

Severity: Major. Status: Resolved.

Ambra experienced a major incident on February 1, 2024 that affected three components (Web Services, Image Processing, and Image Viewing) and lasted 21 minutes. The incident has been resolved; the full update timeline is below.

Started: Feb 01, 2024, 01:47 PM UTC
Resolved: Feb 01, 2024, 02:08 PM UTC
Duration: 21m
Detected by Pingoru: Feb 01, 2024, 01:47 PM UTC

Affected components

Web Services, Image Processing, Image Viewing

Update timeline

  1. investigating Feb 01, 2024, 01:47 PM UTC

    We have identified a major issue affecting Ambra. Our Engineering teams have been mobilized to address the issue, and we will provide frequent updates to keep you informed.

  2. identified Feb 01, 2024, 02:03 PM UTC

    The Engineering teams have identified an issue with Redis memory use. They have implemented a temporary setting that will mitigate the issue until the root cause is fixed. We will continue to post status updates as received.

  3. resolved Feb 01, 2024, 02:08 PM UTC

    The incident has been fully resolved and service is back to normal levels. Our team will conduct a root cause analysis and share the findings as soon as possible. We will continue to monitor the situation to ensure there are no further issues.

  4. postmortem May 08, 2024, 08:51 PM UTC

    An incorrectly configured webhook caused a very large number of sessions to be created, overloading Ambra's session tracking. Once the cause was determined, we disabled the misconfigured webhooks and forcibly expired the invalid sessions. Upon further investigation, we determined that certain workflows can bypass Ambra's limits on the maximum number of sessions per user. R&D will investigate how to adjust these limits to prevent similar situations in the future.
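
The postmortem describes a per-user session limit that some workflows could bypass. As an illustration only, the sketch below shows one common way to make such a limit hard to bypass: every session is created through a single code path that enforces the cap and evicts the oldest sessions. It is not Ambra's implementation. The `SessionStore` class, its method names, and the limit and TTL values are all hypothetical, and an in-memory dictionary stands in for whatever store (such as Redis) tracks sessions in practice.

```python
import time
from collections import OrderedDict, defaultdict


class SessionStore:
    """Hypothetical in-memory session tracker with a per-user cap."""

    def __init__(self, max_sessions_per_user=5, ttl_seconds=3600.0):
        self.max_sessions_per_user = max_sessions_per_user
        self.ttl_seconds = ttl_seconds
        # user_id -> OrderedDict(session_id -> expiry time), oldest first
        self._sessions = defaultdict(OrderedDict)

    def create(self, user_id, session_id, now=None):
        """Register a session; return the IDs evicted to stay under the cap."""
        now = time.time() if now is None else now
        sessions = self._sessions[user_id]

        # Drop sessions whose TTL has already passed.
        for sid in [s for s, expiry in sessions.items() if expiry <= now]:
            del sessions[sid]

        # Re-creating an existing session refreshes it instead of evicting others.
        sessions.pop(session_id, None)

        # Enforce the cap here, in the only code path that adds sessions.
        evicted = []
        while len(sessions) >= self.max_sessions_per_user:
            oldest_id, _ = sessions.popitem(last=False)
            evicted.append(oldest_id)

        sessions[session_id] = now + self.ttl_seconds
        return evicted

    def expire_user(self, user_id):
        """Forcibly expire all sessions for a user; return how many were removed."""
        return len(self._sessions.pop(user_id, {}))

    def count(self, user_id):
        return len(self._sessions.get(user_id, {}))


# A runaway caller (for example a misconfigured webhook) creates many sessions.
store = SessionStore(max_sessions_per_user=3, ttl_seconds=60.0)
for i in range(1000):
    store.create("webhook-user", f"session-{i}", now=0.0)

print(store.count("webhook-user"))        # 3
print(store.expire_user("webhook-user"))  # 3
```

Because the cap is checked inside `create` itself, a caller cannot exceed it by taking a different workflow, as long as no other code writes to the store directly. Whether to evict the oldest sessions or reject new ones is a design choice; eviction keeps the most recent activity alive, while rejection makes a runaway caller visible sooner.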