Elium experienced a critical incident on December 11, 2020 affecting Private Hosting, lasting 2h 43m. The incident has been resolved; the full update timeline is below.
Affected components
Private Hosting
Update timeline
- investigating Dec 11, 2020, 01:33 PM UTC
We are currently investigating this issue.
- investigating Dec 11, 2020, 01:34 PM UTC
Instances hosted in our private hosting facility are unreachable because our internet connectivity is down.
- investigating Dec 11, 2020, 01:35 PM UTC
We switched our internet connectivity to our backup provider.
- identified Dec 11, 2020, 01:38 PM UTC
We had to update our DNS records to point to our backup external IP addresses. Depending on cached values, this may take a few minutes to propagate.
- monitoring Dec 11, 2020, 01:49 PM UTC
The datacenter has confirmed a problem with one of its internet providers. Our backup provider is unaffected.
- monitoring Dec 11, 2020, 02:40 PM UTC
We have been experiencing DNS issues in parts of our private hosting facility since the upstream switch.
- monitoring Dec 11, 2020, 02:43 PM UTC
Our internal DNS resolver was still set to the failing primary internet line. It has been switched to the DNS provider of our backup line.
- monitoring Dec 11, 2020, 02:54 PM UTC
We identified another issue affecting the serving of thumbnails and file contents. It should be resolved as soon as the new DNS record propagates.
- resolved Dec 11, 2020, 04:17 PM UTC
Upstream provider connectivity has been restored in our datacenter.
- postmortem Dec 17, 2020, 09:58 AM UTC
Friday, Dec 11, 2020, 14:25: backbone alarm raised for switch B19B4530WIN0 and a few other devices located downstream.
Friday, Dec 11, 2020, 14:30: basic troubleshooting; a power failure is suspected.
Friday, Dec 11, 2020, 15:10: engineer arrives at WDC; a few tests are run on the power supply and fans of B19B4530WIN0.
Friday, Dec 11, 2020, 15:20: tests inconclusive; we decide to replace the chassis of B19B4530WIN0. B19B4530WIN0 consists of 2 stacked chassis, and the faulty chassis is identified as the C3750-X, which is available as a backbone spare in stock at Wierde.
Friday, Dec 11, 2020, 15:30: spare CAT3750-X taken out of stock and transferred to WDC.
Friday, Dec 11, 2020, 15:30 to 16:15: UTP connections terminating on B19B4530WIN0 untangled and identified to prepare the migration.
Friday, Dec 11, 2020, 16:10: spare switch arrives at WDC.
Friday, Dec 11, 2020, 16:15: spare switch configured.
Friday, Dec 11, 2020, 16:40: faulty switch replaced.
Friday, Dec 11, 2020, 17:00: stack formed between the 2 switch members; re-plugging of the UTP cables begins.
Friday, Dec 11, 2020, 17:04: switch rebooted to apply the system MTU configuration.
Friday, Dec 11, 2020, 17:07: UTP connections fully re-plugged on the spare switch.
Friday, Dec 11, 2020, 17:07: end of the intervention.
ROOT CAUSE: hardware failure of the B19B4530WIN0 chassis.
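The 02:43 PM update notes that the internal DNS resolver was still set to the failing primary line after the switch to the backup provider. As a rough illustration only, and not a description of Elium's actual tooling, the sketch below shows one way a health check could pick the first upstream resolver that still answers. The function names, the probe domain, and the resolver addresses (taken from documentation-only IP ranges) are all assumptions.

```python
import socket
import struct
from typing import Callable, Iterable, Optional


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for an A record."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def udp_probe(resolver_ip: str, name: str = "example.com",
              timeout: float = 2.0) -> bool:
    """Return True if the resolver replies with the matching query id."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (resolver_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable networks
            return False
    return reply[:2] == query[:2]


def first_healthy_resolver(
    resolvers: Iterable[str],
    probe: Callable[[str], bool] = udp_probe,
) -> Optional[str]:
    """Return the first resolver for which the probe succeeds, else None."""
    for ip in resolvers:
        if probe(ip):
            return ip
    return None


if __name__ == "__main__":
    # Hypothetical addresses: primary-line resolver first, backup second.
    print(first_healthy_resolver(["192.0.2.53", "198.51.100.53"]))
```

Running a check like this right after an upstream failover would flag a resolver that still depends on the failed line. The probe is injectable so the selection logic can be exercised without network access.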