Voucherify experienced a major incident on December 12, 2019 affecting EU1 - API, lasting 6h 14m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Dec 12, 2019, 05:07 AM UTC
Communication with the API is currently being impacted by the anomalous behavior of one of our European clusters. We are investigating the issue.
- identified Dec 12, 2019, 05:39 AM UTC
We have identified the source of the issue. Our SRE team has stabilized the affected cluster, and we are now working on a long-term improvement to prevent such a situation in the future.
- monitoring Dec 12, 2019, 06:42 AM UTC
A fix has been implemented and we are monitoring the results.
- resolved Dec 12, 2019, 11:21 AM UTC
After a few hours of monitoring, we confirm that the issue has been resolved. The degraded performance was caused by a lack of resources on one of our application servers: a temporary increase in traffic quickly exhausted the node's capacity. Unfortunately, an incorrect configuration in our auto-scaling procedure meant that the process of adding resources and scaling up the service was not triggered. We have fixed the auto-scaling procedure, the new solution is currently being tested, and we do not expect the issue to recur. We sincerely apologize for any inconvenience caused by this issue and appreciate your understanding in this matter.