Ambra experienced a notice incident on September 7, 2023, affecting Web Services, Image Processing, and 1 more component, and lasting 18 minutes. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Sep 07, 2023, 06:17 PM UTC
We have identified a major issue affecting Ambra. Our Engineering teams have been mobilized to address the issue, and we will provide frequent updates to keep you informed.
- monitoring Sep 07, 2023, 06:21 PM UTC
Users are now able to log in to Ambra. Our Engineering teams are actively monitoring and investigating to identify the root cause.
- monitoring Sep 07, 2023, 06:23 PM UTC
We are continuing to monitor for any further issues.
- monitoring Sep 07, 2023, 06:34 PM UTC
We are continuing to monitor for any further issues.
- resolved Sep 07, 2023, 06:35 PM UTC
The incident has been fully resolved and service is back to normal levels. Our team will conduct a root cause analysis and share the results as soon as possible. We will continue to monitor the situation to ensure there are no further issues.
- postmortem Sep 12, 2023, 06:55 PM UTC
The Ambra platform experienced a brief interruption within our internal network, which caused an automatic failover to a secondary network device. The failover caused application timeouts while connections were re-established across the new route. Our investigation determined that some servers retained connections that were no longer valid after the routing change. These retained connections, combined with new and automatic reconnections, overloaded the affected servers and caused additional timeouts. After we manually restarted services on the affected servers, system functionality returned to normal. We will be adjusting system resources to better handle traffic spikes such as the one we experienced following this event.
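
For readers who want a concrete picture of the reconnection surge described above, the following is a minimal, general-purpose sketch of one common mitigation: retrying with capped exponential backoff and random jitter, so that many clients do not reconnect at the same instant after a failover. It is an illustration only and does not describe Ambra's actual systems or remediation; the function names `backoff_delays` and `connect_with_backoff` and all parameter values are hypothetical.

```python
import random
import time


def backoff_delays(base=0.5, cap=30.0, attempts=6, rng=None):
    """Return one randomized delay (in seconds) per retry attempt.

    Each delay is drawn uniformly from [0, min(cap, base * 2**attempt)],
    which spreads retries out instead of synchronizing them.
    All defaults here are illustrative assumptions.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))
    return delays


def connect_with_backoff(connect, base=0.5, cap=30.0, attempts=6,
                         sleep=time.sleep, rng=None):
    """Call connect() until it succeeds or the attempts run out.

    `connect` is any caller-supplied function that returns a connection
    or raises ConnectionError (a hypothetical stand-in for a real client).
    """
    last_error = None
    for delay in backoff_delays(base, cap, attempts, rng):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            sleep(delay)  # wait a randomized interval before retrying
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

A caller would pass its own connection function, for example `connect_with_backoff(open_session)`, where `open_session` is a placeholder for whatever establishes the connection. Backoff of this kind reduces the size of a reconnection spike; it does not by itself detect or close connections that became invalid after a routing change.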