Ambra incident

InteleShare Incident

Major · Resolved

Ambra experienced a major incident on April 10, 2025, affecting Web Services, Image Processing, and Image Viewing and lasting 2h 9m. The incident has been resolved; the full update timeline is below.

Started: Apr 10, 2025, 03:55 PM UTC
Resolved: Apr 10, 2025, 06:04 PM UTC
Duration: 2h 9m
Detected by Pingoru: Apr 10, 2025, 03:55 PM UTC

Affected components

Web Services, Image Processing, Image Viewing

Update timeline

  1. investigating Apr 10, 2025, 03:55 PM UTC

    We have received reports of issues on the InteleShare platform. Engineering teams are currently investigating. Additional information will be provided as soon as it is available.

  2. investigating Apr 10, 2025, 04:23 PM UTC

    At this time, our platform is experiencing high system load. Users may experience overall slowness in sending/receiving, image loading, and UI navigation. Our engineering teams have not yet identified the root cause. We are continuing to investigate and will provide further updates as soon as possible.

  3. monitoring Apr 10, 2025, 04:54 PM UTC

    The root cause has been identified, and a remediation to mitigate the issue has been applied. Overall system performance has improved at this time. We will continue to monitor the situation to ensure there are no further issues and will send additional updates on any new developments.

  4. resolved Apr 10, 2025, 06:04 PM UTC

    The incident has been fully resolved and service is back to normal levels. Our team will be conducting a root cause analysis and sharing it as soon as possible. We will continue to monitor the situation to ensure there are no further issues.

  5. postmortem Apr 14, 2025, 08:04 PM UTC

    Our investigation has revealed that a scheduled task, which was designed to incrementally process items since its last run, was erroneously attempting to process the entire dataset each time it was executed. This misconfiguration led to a cumulative impact on system performance, resulting in errors and timeouts during both interactive sessions and API calls. Upon discovery, we immediately disabled the task to prevent further performance degradation.

    We are committed to ensuring the reliability of our platform and are exploring the need for this functionality. We will reimplement the task using current APIs that adhere to a more standardized workflow.

    In addition to addressing the immediate issue, our operations and research and development (R&D) teams have identified opportunities for enhancing the overall performance of the InteleShare platform. These improvements are scheduled for implementation in upcoming releases and are part of our ongoing commitment to providing you with a robust and efficient platform.

    We apologize for any inconvenience this may have caused and appreciate your understanding as we continuously work to improve our services.
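
The failure mode described in the postmortem (an incremental task that reprocesses the whole dataset on every run) can be illustrated with a minimal sketch. This is not the vendor's implementation, which has not been published. The `Item` type, the `select_since` and `run_task` helpers, and the use of a "last modified" watermark are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional, Tuple


@dataclass
class Item:
    # Hypothetical record type; the real data model is not described in the source.
    id: int
    updated_at: datetime


def select_since(items: Iterable[Item], watermark: Optional[datetime]) -> List[Item]:
    """Return only the items modified after the watermark.

    A watermark of None means "never run before", so every item is selected.
    """
    if watermark is None:
        return list(items)
    return [it for it in items if it.updated_at > watermark]


def run_task(
    items: Iterable[Item], watermark: Optional[datetime]
) -> Tuple[List[Item], Optional[datetime]]:
    """Select one incremental batch and return it with the advanced watermark."""
    batch = select_since(items, watermark)
    # Advance the watermark to the newest item in the batch; keep the old
    # value when there was nothing new to process.
    new_watermark = max((it.updated_at for it in batch), default=watermark)
    return batch, new_watermark


T0 = datetime(2025, 4, 10, 12, 0, tzinfo=timezone.utc)
items = [Item(i, T0 + timedelta(minutes=i)) for i in (1, 2, 3)]

# Correct behaviour: the watermark is carried from one run to the next.
first_batch, wm = run_task(items, None)   # first run sees everything
second_batch, wm = run_task(items, wm)    # nothing changed since the last run
items.append(Item(4, T0 + timedelta(minutes=4)))
third_batch, wm = run_task(items, wm)     # only the newly added item

# Assumed failure mode: the watermark is lost, so the task sees the full dataset again.
buggy_batch, _ = run_task(items, None)
```

In this sketch the cost of each correct run scales with the number of changed items, while the run without a stored watermark scales with the size of the whole dataset, which is consistent with the cumulative load the postmortem describes.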