miLibris incident

Heavy load on CDN

Minor Resolved

miLibris experienced a minor incident on October 22, 2016, lasting 2h 56m. The incident has been resolved; the full update timeline is below.

Started
Oct 22, 2016, 07:00 AM UTC
Resolved
Oct 22, 2016, 09:56 AM UTC
Duration
2h 56m
Detected by Pingoru
Oct 22, 2016, 07:00 AM UTC

Update timeline

  1. investigating Oct 22, 2016, 08:37 AM UTC

    We are currently investigating this issue.

  2. resolved Oct 22, 2016, 09:56 AM UTC

    This incident has been resolved.

  3. postmortem Aug 01, 2018, 07:37 PM UTC

    On Saturday morning, we had a service disruption of about 2 hours caused by an overload of our CDN: we exceeded our bandwidth limits. Once service was restored, we investigated what had really happened. Morning traffic spikes are a normal part of our business: we monitor our bandwidth and keep a very large allowance for them.

    ### Problem origin

    The problem was in fact in our processing chain. We use a pool of 20 dedicated servers to convert client inputs and package the final content before sending it through our delivery infrastructure. All 20 servers are synchronized with a central configuration manager. We noticed that some configuration files had failed to sync, in particular the configuration of the PDF / asset size optimization process, so some of our servers were unable to optimize file sizes. A growing number of our releases were sent to our delivery infrastructure without any optimization. Some of them were larger than 500 MB, which drastically increased the amount of bandwidth needed. This has now been fixed and we are back to a normal situation.
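
The failure mode described in the postmortem (configuration drift on some processing servers, followed by unoptimized releases reaching delivery) lends itself to two simple guards. The sketch below is purely illustrative and is not miLibris's actual tooling: the function names, the server and release names, and the use of 500 MB as a hard limit are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List

# Hypothetical limit, borrowed from the 500 MB figure quoted in the postmortem.
MAX_RELEASE_BYTES = 500 * 1000 * 1000


def config_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a configuration file's bytes."""
    return hashlib.sha256(content).hexdigest()


def find_drifted_servers(central: bytes, per_server: Dict[str, bytes]) -> List[str]:
    """List servers whose configuration differs from the central copy."""
    expected = config_digest(central)
    return sorted(
        name
        for name, content in per_server.items()
        if config_digest(content) != expected
    )


def oversized_releases(sizes: Dict[str, int], limit: int = MAX_RELEASE_BYTES) -> List[str]:
    """List releases whose packaged size in bytes exceeds the limit."""
    return sorted(name for name, size in sizes.items() if size > limit)


if __name__ == "__main__":
    central = b"optimize_pdf = true\n"
    # "worker-02" stands in for a server whose sync failed (empty config).
    servers = {"worker-01": central, "worker-02": b""}
    print(find_drifted_servers(central, servers))  # ['worker-02']
    print(oversized_releases({"issue-a": 12_000_000, "issue-b": 620_000_000}))  # ['issue-b']
```

Under these assumptions, running the drift check after each configuration sync, and the size check before a release is handed to delivery, would surface this kind of problem before it shows up as CDN bandwidth usage.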