URL2PNG incident

elevated rate of 500s in production

URL2PNG experienced a notice incident on October 22, 2019 affecting API - Legacy, lasting 10h 17m. The incident has been resolved; the full update timeline is below.

Started
Oct 22, 2019, 12:21 PM UTC
Resolved
Oct 22, 2019, 10:38 PM UTC
Duration
10h 17m
Detected by Pingoru
Oct 22, 2019, 12:21 PM UTC

Affected components

API - Legacy

Update timeline

  1. investigating Oct 22, 2019, 12:21 PM UTC

    Some request types are experiencing an elevated rate of 500s. Initial evidence points to transient upstream connectivity issues, but we are still investigating because current status updates from AWS/DigitalOcean do not account for our regions at this time.

  2. identified Oct 22, 2019, 12:40 PM UTC

    The issue has been identified and a fix is being implemented.

  3. identified Oct 22, 2019, 04:28 PM UTC

    Related upstream event: https://status.digitalocean.com/incidents/1z3kmlvz69v6

  4. identified Oct 22, 2019, 07:40 PM UTC

    We are continuing to work on a fix for this issue.

  5. identified Oct 22, 2019, 07:45 PM UTC

    The upstream DNS/network issue has gotten considerably worse. We have now deployed a fleet-wide workaround, and error rates should continue to drop with it in place. We are actively monitoring the situation and will post updates as we deploy any additional fixes.

  6. monitoring Oct 22, 2019, 07:53 PM UTC

    A fix has been implemented and we are monitoring the results.

  7. resolved Oct 22, 2019, 10:38 PM UTC

    Marking as resolved at this time. We will post an additional update once public statements are available from AWS on their ongoing DNS challenges today. - https://news.ycombinator.com/item?id=21327980