INKY incident

Incoming Email and the Analysis Dashboard currently delayed

Major, Resolved

INKY experienced a major incident on September 16, 2023, affecting Dashboard Services US, Inky Region 1 - Southeast US, and 1 more component, lasting 1h 58m. The incident has been resolved; the full update timeline is below.

Started
Sep 16, 2023, 04:03 PM UTC
Resolved
Sep 16, 2023, 06:01 PM UTC
Duration
1h 58m
Detected by Pingoru
Sep 16, 2023, 04:03 PM UTC

Affected components

Dashboard Services US
Inky Region 1 - Southeast US
Dashboard Services EU
Inky Region 2 - Eastern US
Inky Region 3 - Northwest US
Inky EU

Update timeline

  1. identified Sep 16, 2023, 04:03 PM UTC

    We have identified the issue and are implementing the fix now. We will update as soon as it is in place and traffic returns to normal.

  2. identified Sep 16, 2023, 04:30 PM UTC

    Fix is in place for mail flow and we are currently expanding capacity to handle the backlog of email that built up during the outage.

  3. monitoring Sep 16, 2023, 04:55 PM UTC

    Fix has been applied on the Dashboard. The backlog of emails has cleared with the additional capacity. We will continue to monitor.

  4. resolved Sep 16, 2023, 06:01 PM UTC

    All systems are back to normal and steady.

  5. postmortem Sep 18, 2023, 02:24 PM UTC

    # Post incident report

    Start: 16-September-2023 1400 UTC
    End: 16-September-2023 1630 UTC
    Duration: 2 hours 30 min

    ## Summary:

    Mail was delayed and sections of the Inky Dashboard were unavailable for approximately two and a half hours on Saturday, September 16th, due to an expired server certificate.

    ## Root Cause:

    Due to human error, a server certificate expired before we were able to replace it with a renewed one.

    ## Mitigation Action:

    We renewed the server certificate and increased capacity to process the backlog as quickly as possible and return traffic to normal.

    ## Customer Impact:

    For approximately two and a half hours, users' email was delayed during the period between the server certificate expiring and our installing the renewed certificate. During this period, sections of the Inky Dashboard were also unavailable.

    ## Follow-up Items and Preventative Measures:

    * We have implemented multiple alerting and monitoring processes to reduce the risk of missing any future server certificate renewals.
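
The postmortem does not describe how the new certificate alerting works. As a rough illustration only, a check of this kind could read a certificate's `notAfter` timestamp and flag it well before expiry. The sketch below uses Python's standard `ssl` module; the function names, the 30-day threshold, and the status labels are assumptions made for this example, not details of INKY's monitoring.

```python
import socket
import ssl
from datetime import datetime, timezone

# Assumed warning window; the report does not state a threshold.
WARN_DAYS = 30


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return days from `now` (timezone-aware) until the notAfter timestamp.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def classify(days_left: float, warn_days: int = WARN_DAYS) -> str:
    """Map remaining days to a hypothetical alert level."""
    if days_left <= 0:
        return "expired"
    if days_left <= warn_days:
        return "warn"
    return "ok"


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port over TLS and return the peer cert's notAfter.

    With the default context, an already-expired certificate makes the
    handshake raise ssl.SSLCertVerificationError, which a monitor should
    treat as an alert in its own right.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `classify(days_until_expiry(fetch_not_after(host), datetime.now(timezone.utc)))` for each monitored host and page someone on anything other than `"ok"`. Running the check from more than one place is one way to avoid depending on a single alert path.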