Quintype incident

Sites Down

Major Resolved

Quintype experienced a major incident on December 31, 2019 affecting the Breaking-news API. The incident has been resolved; the full update timeline is below.

Started
Dec 31, 2019, 05:48 AM UTC
Resolved
Dec 31, 2019, 05:48 AM UTC
Duration
Detected by Pingoru
Dec 31, 2019, 05:48 AM UTC

Affected components

Breaking-news API

Update timeline

  1. investigating Dec 31, 2019, 05:44 AM UTC

    We are currently investigating the issue.

  2. resolved Dec 31, 2019, 05:48 AM UTC

    The issue has been resolved. It was caused by a deleted load balancer. We apologise for the technical glitch. Please see the attached post mortem for more details. During this downtime, most publishers were unaffected because `stale-if-error` is present in the Cache-Control headers. However, some publishers saw an error on the homepage where the front end app did not handle the failure scenario correctly. Please contact [email protected] if this is the case.
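The two behaviours mentioned in this update can be illustrated with a minimal sketch. It assumes a Python front end; the function names (`parse_cache_control`, `stale_if_error_seconds`, `breaking_news_or_empty`) are hypothetical and not part of any Quintype API. The first two helpers read the `stale-if-error` window (RFC 5861) from a Cache-Control header, and the third shows one way a front end could render the homepage without breaking news instead of failing when the API call errors.

```python
from typing import Callable, Dict, List, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header into lowercase directive -> value.

    Directives without a value (e.g. "public") map to None.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def stale_if_error_seconds(header: str) -> int:
    """Return the stale-if-error window in seconds, or 0 if absent or malformed."""
    value = parse_cache_control(header).get("stale-if-error")
    if value is None:
        return 0
    try:
        return max(int(value), 0)
    except ValueError:
        return 0


def breaking_news_or_empty(fetch: Callable[[], List[dict]]) -> List[dict]:
    """Call fetch(); on any failure return an empty list so the page still renders.

    `fetch` is a hypothetical callable that requests breaking news from the API.
    """
    try:
        return fetch()
    except Exception:
        # Degrade gracefully: show the homepage without the breaking-news strip.
        return []
```

With a fallback of this kind, an outage of the Breaking-news API would only hide the breaking-news section rather than produce an error page, even when no cache sits in front of the request.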

  3. postmortem Dec 31, 2019, 08:48 AM UTC

    This downtime was caused by an engineer on the Quintype team deleting an internal load balancer. Our internal system detected the deletion and created a new load balancer for traffic, which came up within a minute. However, a proxy server was still pointing to the old load balancer and had to be reloaded manually. To prevent such issues in future, we are making the following changes:

    * We are ensuring that all load balancers have delete protection enabled by default.
    * We are automating the reload of HAProxy, by updating a DNS record, so that such issues are resolved more quickly.
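The post mortem does not describe how the automated reload will work. As a rough sketch of one possible approach, and purely an assumption, a small watcher could re-resolve the load balancer's DNS name and reload HAProxy when the resolved addresses change. The hostname passed in and the `systemctl reload haproxy` command (which presumes a systemd-managed HAProxy) are illustrative, as are all function names.

```python
import socket
import subprocess
from typing import Callable, FrozenSet, Optional


def resolve_addresses(hostname: str) -> FrozenSet[str]:
    """Resolve hostname to the set of IP addresses it currently points at."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return frozenset(info[4][0] for info in infos)


def reload_haproxy() -> None:
    """Reload HAProxy. Assumes a systemd unit named 'haproxy' (an assumption)."""
    subprocess.run(["systemctl", "reload", "haproxy"], check=True)


def check_and_reload(
    hostname: str,
    previous: Optional[FrozenSet[str]],
    resolve: Callable[[str], FrozenSet[str]] = resolve_addresses,
    reload: Callable[[], None] = reload_haproxy,
) -> FrozenSet[str]:
    """Reload the proxy if the DNS record changed since the previous check.

    Returns the current address set so the caller can pass it to the next check.
    On the first call (previous is None) nothing is reloaded.
    """
    current = resolve(hostname)
    if previous is not None and current != previous:
        reload()
    return current
```

A caller would run `check_and_reload` periodically, for example from a scheduler, and feed each returned address set into the next call. The `resolve` and `reload` parameters are injectable so that the change-detection logic can be exercised without touching DNS or a running proxy.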