Ambra experienced a minor incident on April 18, 2024 affecting Web Services, Image Processing, and 1 more component, lasting 32 minutes. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Apr 18, 2024, 09:05 PM UTC
We have received reports of intermittent viewing issues on the Ambra platform. Engineering teams are currently investigating. Additional information will be provided as soon as it is available.
- resolved Apr 18, 2024, 09:38 PM UTC
The incident has been fully resolved by adding transcoding nodes. Service is back to normal levels. Our team will conduct a root cause analysis and share the findings as soon as possible. We will continue to monitor the situation to ensure there are no further issues.
- postmortem May 08, 2024, 03:27 PM UTC
We have identified an issue in our real-time transcoding and rendering service: the autoscaling logic was not functioning optimally, which led to memory shortages. As the system attempted to scale up to handle a higher workload, the newly created instances overutilized resources and failed. To address this, we have refined our autoscaling process. The adjustment ensures that new instances use the available memory more efficiently, allowing the system to scale horizontally and maintain performance without overburdening the infrastructure.
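
For readers unfamiliar with this failure mode, the following is a minimal illustrative sketch of memory-aware scale-up planning. It is not Ambra's implementation: the function name `plan_scale_up`, the `ScalingPlan` type, and all parameters and numbers are hypothetical, chosen only to show how capping the work assigned to each new instance by its memory budget can keep freshly launched instances from running out of memory.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPlan:
    """Hypothetical result type: how many instances to run and their planned memory use."""
    instances: int
    planned_memory_mb: int  # memory each instance is expected to use at full load


def plan_scale_up(pending_jobs: int,
                  jobs_per_instance: int,
                  memory_per_job_mb: int,
                  instance_memory_mb: int,
                  headroom: float = 0.2) -> ScalingPlan:
    """Size a scale-up so that no instance is planned beyond its memory budget.

    All inputs are assumptions for illustration; a real service would
    measure them rather than hard-code them.
    """
    if pending_jobs < 0:
        raise ValueError("pending_jobs must be non-negative")
    if min(jobs_per_instance, memory_per_job_mb, instance_memory_mb) <= 0:
        raise ValueError("capacity and memory figures must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")

    # Reserve a fraction of memory so that bursts do not exhaust the instance.
    usable_mb = instance_memory_mb * (1 - headroom)

    # Number of jobs that fit in the usable memory of one instance.
    jobs_by_memory = int(usable_mb // memory_per_job_mb)
    if jobs_by_memory < 1:
        raise ValueError("a single job does not fit in one instance")

    # The effective capacity is the tighter of the nominal and memory limits.
    capacity = min(jobs_per_instance, jobs_by_memory)
    instances = math.ceil(pending_jobs / capacity)
    return ScalingPlan(instances, capacity * memory_per_job_mb)


if __name__ == "__main__":
    plan = plan_scale_up(pending_jobs=100, jobs_per_instance=10,
                         memory_per_job_mb=512, instance_memory_mb=4096,
                         headroom=0.25)
    print(plan)
```

In this sketch the nominal limit of 10 jobs per instance is reduced to 6 because only 3072 MB of each 4096 MB instance is treated as usable, so more, smaller-loaded instances are requested. This mirrors the general idea of scaling horizontally instead of overloading each new instance, under the stated assumptions.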