Voucherify incident

API connectivity issues

Major, Resolved

Voucherify experienced a major incident on December 12, 2019 affecting EU1 - API, lasting 6h 14m. The incident has been resolved; the full update timeline is below.

Started: Dec 12, 2019, 05:07 AM UTC
Resolved: Dec 12, 2019, 11:21 AM UTC
Duration: 6h 14m
Detected by Pingoru: Dec 12, 2019, 05:07 AM UTC

Affected components

EU1 - API

Update timeline

  1. investigating Dec 12, 2019, 05:07 AM UTC

    Communication with the API is currently impacted by anomalous behavior in one of our European clusters. We are investigating the issue.

  2. identified Dec 12, 2019, 05:39 AM UTC

    We have identified the source of the issue. Our SRE team has stabilized the affected cluster, and we are now working on a long-term improvement to prevent this situation from recurring.

  3. monitoring Dec 12, 2019, 06:42 AM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Dec 12, 2019, 11:21 AM UTC

    After a few hours of monitoring, we can confirm that the issue has been resolved. The degraded performance was caused by a lack of resources on one of our application servers: a temporary increase in traffic quickly exhausted the node's capacity. Unfortunately, because of an incorrect configuration in our auto-scaling procedure, the process of adding resources and scaling up the service was not triggered. The issue has been resolved and should not recur. Additionally, we have fixed the auto-scaling procedure, and a new solution is now being tested. We sincerely apologize for any inconvenience caused by this issue and appreciate your understanding.
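
The failure mode described in the final update, where a node saturates but a misconfigured auto-scaling rule never fires, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ScalingPolicy` fields, the `should_scale_up` helper, the thresholds, and the utilization readings are hypothetical and do not reflect Voucherify's actual auto-scaling configuration or tooling.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical scale-up rule; not Voucherify's actual configuration."""
    cpu_threshold: float  # utilization (0.0-1.0) a sample must exceed to count as a breach
    breach_samples: int   # number of consecutive breaching samples required to scale up


def should_scale_up(policy: ScalingPolicy, cpu_samples: list[float]) -> bool:
    """Return True if the most recent samples all exceed the threshold."""
    if len(cpu_samples) < policy.breach_samples:
        return False
    recent = cpu_samples[-policy.breach_samples:]
    return all(sample > policy.cpu_threshold for sample in recent)


# Made-up utilization readings for one node during a traffic spike.
spike = [0.55, 0.82, 0.91, 0.97, 0.99]

# A rule that reacts to sustained load above 80%.
working = ScalingPolicy(cpu_threshold=0.80, breach_samples=3)

# A misconfigured rule: utilization can never exceed 100%, so it never fires.
misconfigured = ScalingPolicy(cpu_threshold=1.00, breach_samples=3)

print(should_scale_up(working, spike))
print(should_scale_up(misconfigured, spike))
```

With the same readings, the first policy requests more capacity while the second stays silent even though the node is saturated. A periodic check that replays a synthetic spike against the live policy is one way to catch this kind of misconfiguration before real traffic does.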