Gini experienced a notice incident on November 25, 2019, lasting 10h 15m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Nov 25, 2019, 10:13 PM UTC
We are currently investigating this issue.
- identified Nov 25, 2019, 11:51 PM UTC
The issue has been identified and a fix is being implemented.
- monitoring Nov 26, 2019, 02:20 AM UTC
A fix has been implemented and we are monitoring the results.
- resolved Nov 26, 2019, 08:28 AM UTC
This incident has been resolved.
- postmortem Nov 26, 2019, 02:22 PM UTC
Our production system was affected by an outage that lasted, with varying degrees of impact for our partners, from 25.11.2019 23:00 until 26.11.2019 03:15. **Document processing and API availability were seriously impacted until 01:15**.

The initial outage was caused by a failure in one of our Kubernetes clusters. We operate a redundant control plane, but this specific problem still resulted in downtime of the Kubernetes API server. The API server is vital to the functioning of the cluster, and without it the problem spread to large parts of our infrastructure.

After the API server was restored, it took us longer than anticipated to identify, understand and fix a networking issue that the API outage itself had caused. Once the networking problems were fixed, we finished reactivating the affected data stores and databases to restore full service functionality. **No data was lost and all the documents that we accepted have been processed**.

We strive to provide the best service and availability to our partners and customers, and we didn’t live up to this goal. We have already identified ways to prevent similar issues in the future and are working on implementing them. Feel free to contact us at [[email protected]](mailto:[email protected]) if you have any questions about Gini or this outage.