Ambra experienced a major incident on November 6, 2024, affecting Web Services, Image Processing, and 1 more component, lasting 17h 24m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Nov 06, 2024, 12:21 AM UTC
We have received reports of issues on the Ambra platform related to Gateways. Engineering teams are currently investigating. Additional information will be provided as soon as it is available.
- investigating Nov 06, 2024, 12:50 AM UTC
Our Engineering teams are actively investigating and working to identify the root cause. We understand the urgency and we appreciate your patience as we work to address the issue.
- investigating Nov 06, 2024, 01:37 AM UTC
This evening, a gateway update that affected only the gateway logging configuration was pushed out to customers. In some cases the package was not fully applied, which left some Gateway devices in a state in which they could not function. Please note this is a package installation issue only. If you would like to perform the fix yourself, please see KB0031682 - https://serviceportal.intelerad.com/csm?id=kb_article_view&sysparm_article=KB0031682. Please call support at 888-315-0790 if you need assistance with the installation.
- investigating Nov 06, 2024, 02:19 AM UTC
We continue to work to identify the root cause of the install issue. In the interim we have identified how to manually fix the install issue. If you would like to perform the fix yourself, please see the KB0031682 in the ISP - https://serviceportal.intelerad.com/csm?id=kb_article_view&sysparm_article=KB0031682 or call support at 888-315-0790 if you need assistance with the new installation.
- investigating Nov 06, 2024, 02:56 AM UTC
We are continuing to investigate this issue.
- investigating Nov 06, 2024, 03:32 AM UTC
We are continuing to investigate this issue.
- investigating Nov 06, 2024, 04:21 AM UTC
We continue to work to identify the root cause of the install issue. In the interim we have identified how to manually fix the install issue. If you would like to perform the fix yourself, please see the KB0031682 in the ISP - https://serviceportal.intelerad.com/csm?id=kb_article_view&sysparm_article=KB0031682 or call support at 888-315-0790 if you need assistance with the new installation. The next update will be in 3 hours.
- investigating Nov 06, 2024, 06:42 AM UTC
We continue to work to identify the root cause of the install issue. In the interim we have identified how to manually fix the install issue. If you would like to perform the fix yourself, please see the KB0031682 in the ISP - https://serviceportal.intelerad.com/csm?id=kb_article_view&sysparm_article=KB0031682 or call support at 888-315-0790 if you need assistance with the new installation.
- investigating Nov 06, 2024, 09:50 AM UTC
We continue to work to identify the root cause of the install issue. In the interim we have identified how to manually fix the install issue. If you would like to perform the fix yourself, please see the KB0031682 in the ISP - https://serviceportal.intelerad.com/csm?id=kb_article_view&sysparm_article=KB0031682 or call support at 888-315-0790 if you need assistance with the new installation.
- investigating Nov 06, 2024, 12:50 PM UTC
We continue to work to identify the root cause of the install issue. In the interim we have identified how to manually fix the install issue. If you would like to perform the fix yourself, please see the KB0031682 in the ISP - https://serviceportal.intelerad.com/csm?id=kb_article_view&sysparm_article=KB0031682 or call support at 888-315-0790 if you need assistance with the new installation.
- resolved Nov 06, 2024, 05:46 PM UTC
The incident has been closed. Our team will be conducting a root cause analysis. We will continue to monitor the situation to ensure there are no further issues.
- postmortem Dec 12, 2024, 10:26 PM UTC
Issue: After the recent InteleShare release, a small percentage of transcoding requests began failing intermittently, in a way that caused the thread handling the request to enter an infinite loop, waiting for additional data that would never arrive. As these stuck threads accumulated, their combined resource usage eventually led to performance degradation, and finally to errors or timeouts once the maximum thread pool limit was reached.
Root Cause: The recent release of InteleShare included updates to a client library used for internal network communication. The new library improved overall performance, but had different timeout behavior, which could sometimes cause slow connections to be closed without the error being passed through to other components.
Resolution: We have adjusted our configuration settings so that the new library behaves similarly to the previous library, and the system is now stable.
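For illustration only, the failure mode in the postmortem (a worker waiting indefinitely for data after a connection was closed without an error being raised) can be guarded against by bounding every wait and treating an early close as an error. The following is a minimal Python sketch under stated assumptions, not the actual InteleShare code or its client library: the names `read_all` and `UpstreamTimeout` are hypothetical, and a `queue.Queue` stands in for the internal network connection.

```python
import queue


class UpstreamTimeout(Exception):
    """Hypothetical error: no data arrived within the allowed wait."""


def read_all(chunks: "queue.Queue", expected: int, timeout_s: float) -> bytes:
    """Collect at least `expected` bytes from `chunks`.

    `chunks` is a stand-in for a connection: it yields byte strings,
    or None as a sentinel meaning the peer closed the connection.
    """
    buf = bytearray()
    while len(buf) < expected:
        try:
            # Bounded wait: the worker never blocks longer than timeout_s.
            chunk = chunks.get(timeout=timeout_s)
        except queue.Empty:
            raise UpstreamTimeout(
                f"no data for {timeout_s}s after {len(buf)} bytes"
            ) from None
        if chunk is None:
            # Surface the early close instead of waiting for more data.
            raise ConnectionError(
                f"closed after {len(buf)} of {expected} bytes"
            )
        buf.extend(chunk)
    return bytes(buf)
```

In this sketch a stalled or silently closed connection ends the request with an exception, so the worker thread is returned to the pool rather than accumulating as a stuck thread.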