Evidos incident

Transaction in status progress

Notice Resolved

Evidos experienced a notice-level incident on July 9, 2025, affecting the API and Portal and lasting 1h 30m. The incident has been resolved; the full update timeline is below.

Started
Jul 09, 2025, 06:59 PM UTC
Resolved
Jul 09, 2025, 08:30 PM UTC
Duration
1h 30m
Detected by Pingoru
Jul 09, 2025, 06:59 PM UTC

Affected components

API, Portal

Update timeline

  1. investigating Jul 09, 2025, 06:59 PM UTC

    We are currently investigating this issue.

  2. investigating Jul 09, 2025, 07:11 PM UTC

    We are continuing to investigate this issue.

  3. identified Jul 09, 2025, 07:49 PM UTC

    The issue has been identified and a fix is being implemented.

  4. monitoring Jul 09, 2025, 08:01 PM UTC

    A fix has been implemented and we are monitoring the results.

  5. resolved Jul 09, 2025, 08:30 PM UTC

    This incident has been resolved.

  6. postmortem Jul 11, 2025, 03:01 PM UTC

    ### What happened?

    On July 9, 2025, between 20:59 and 22:01 CEST, our platform experienced a service disruption that impacted availability. During this time, customers were unable to access the platform, and transactions were temporarily halted.

    ### What we did

    Our team noticed an increase in specific errors on the platform and immediately began investigating the root cause. We observed that queues were building up rapidly, and the platform was not processing the signing of transactions as expected. To prevent further degradation and ensure data integrity, we proactively placed the platform into maintenance mode.

    During this window, we identified a misconfiguration in our new containerized environment. The platform was running solely on one document signing service due to this configuration error. Although a second document signing service was active, it was not receiving any traffic.

    Once the issue was identified, we corrected the configuration and restored the affected server. The API was restored and operational within approximately 15 minutes, as there were no issues with creating transactions. However, access to the portal and signing transactions remained unavailable. The entire platform was fully brought back online only after the queues had been processed and stability was verified.

    ### What caused the issue?

    The root cause was a configuration error introduced during our transition to a containerized environment. Due to a misconfiguration, only one document signing service was actively handling traffic. When that server failed, the second server, although healthy, was not utilized, leading to a service outage.

    ### What are we doing next?

    To prevent similar issues from happening again and to ensure service continuity, we are taking the following steps:

    1. Audit on configuration: Conducting a thorough audit of our containerized environment configuration to identify and correct any potential misconfigurations.
    2. Prominent validation checks: Adding this scenario as a key validation checkpoint in our containerization rollout process to ensure traffic routing and failover mechanisms are correctly configured.

    Moving to a containerized architecture is a critical part of our broader cloud migration strategy. Our ultimate vision is to fully transition to Entrust’s EU hosting infrastructure. This migration will allow us to leverage high-availability capabilities and failover mechanisms to ensure service continuity. Additionally, it will provide a scalable and resilient platform that can meet evolving customer needs while adhering to strict regional compliance requirements to ensure that data resides in designated jurisdictions. We expect to finalize the migration to a fully containerized environment this quarter.
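
The failure mode described in the postmortem (a healthy second signing service that received no traffic) is the kind of condition a rollout validation check could catch. The following is only a minimal sketch of such a check, not Evidos's actual tooling: the `Backend` type, the function names, the instance labels, and the request counts are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """One service instance behind a load balancer (hypothetical model)."""
    name: str       # illustrative instance label
    healthy: bool   # result of a health probe
    requests: int   # requests routed to it during the sample window


def find_idle_healthy(backends: list[Backend]) -> list[str]:
    """Return names of healthy backends that received no traffic."""
    return [b.name for b in backends if b.healthy and b.requests == 0]


def has_failover_capacity(backends: list[Backend], minimum: int = 2) -> bool:
    """True when at least `minimum` healthy backends are serving traffic."""
    serving = [b for b in backends if b.healthy and b.requests > 0]
    return len(serving) >= minimum


if __name__ == "__main__":
    # Made-up sample mirroring the scenario: two healthy instances,
    # but only one of them is actually receiving requests.
    sample = [
        Backend("signing-1", healthy=True, requests=1200),
        Backend("signing-2", healthy=True, requests=0),
    ]
    print(find_idle_healthy(sample))      # ['signing-2']
    print(has_failover_capacity(sample))  # False
```

A check like this would flag the misrouting before the single active instance fails, since a healthy instance with zero requests indicates that failover has never been exercised.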