Close incident

Delayed Email Sync

Minor Resolved

Close experienced a minor incident on October 20, 2022: email sync into the Close application was delayed from 5:30 AM to 6:40 AM EDT. The incident has been resolved; the full update timeline is below.

Started
Oct 20, 2022, 07:03 PM UTC
Resolved
Oct 20, 2022, 09:30 AM UTC
Duration
Detected by Pingoru
Oct 20, 2022, 07:03 PM UTC

Update timeline

  1. resolved Oct 20, 2022, 07:03 PM UTC

    Email sync into the Close application was delayed between 5:30 AM and 6:40 AM EDT.

  2. postmortem Oct 20, 2022, 07:04 PM UTC

    Close sincerely apologizes for the interruption of our service. We take the stability of our platform very seriously. Below is an explanation of what happened and how we will prevent another such interruption from occurring.

    ## Impact

    On October 20, from 5:30 AM until 6:40 AM EDT, email syncing into the Close application was delayed for some customers. No data was lost during this incident. Email sync functionality was restored by 6:40 AM EDT.

    ## Root Cause and Resolution

    The root cause of this incident was a bug in a vendor’s software introduced during scheduled maintenance on 10/18/2022. The impact of this bug was detected on 10/20, and the Close engineering team rolled the affected component back to a known working version to resolve the incident. To prevent this from happening in the future, the Close engineering team has added monitoring that alerts it when container workloads are not being scheduled properly.

    ## Timeline

    - 10/18 10:56 PM EDT: The Close engineering team upgrades components of our container platform during a scheduled maintenance window. Error rates for scheduling containers begin to slowly increase.
    - 10/20 5:30 AM EDT: Close support escalates issues with email syncing to the engineering team.
    - 10/20 6:10 AM EDT: Close engineering identifies an issue with scheduling container workloads.
    - 10/20 6:22 AM EDT: The Close engineering team begins manual intervention to lessen customer impact.
    - 10/20 6:38 AM EDT: Email sync functionality is largely restored.
    - 10/20 6:39 AM EDT: The Close engineering team isolates the issue to the interface between our secret store and containers which consume secrets.
    - 10/20 9:19 AM EDT: The root cause is identified as a bug introduced in the version of our secret store connector during the 10/18 scheduled maintenance.
    - 10/20 9:25 AM EDT: The Close engineering team rolls back the secret store connector to a known working version, resolving the incident.
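
The postmortem says scheduling error rates rose slowly after the 10/18 upgrade and that monitoring was later added to catch workloads that are not scheduled properly. It does not describe how that monitoring works. The following is only a minimal sketch of one way such a check could look, assuming scheduling attempts and failures are counted per fixed time window. `SchedulingWindow`, `failure_rate`, `should_alert`, and the default threshold values are all hypothetical names and numbers chosen for illustration, not Close's implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SchedulingWindow:
    """Scheduling outcomes counted in one fixed time window (hypothetical shape)."""
    attempts: int
    failures: int


def failure_rate(window: SchedulingWindow) -> float:
    """Fraction of scheduling attempts that failed; 0.0 if nothing was attempted."""
    if window.attempts == 0:
        return 0.0
    return window.failures / window.attempts


def should_alert(windows: Sequence[SchedulingWindow],
                 threshold: float = 0.05,
                 consecutive: int = 3) -> bool:
    """Return True when the most recent `consecutive` windows all exceed `threshold`.

    Requiring several windows in a row ignores one-off spikes while still
    catching a slow, sustained rise in scheduling failures.
    The default values are illustrative assumptions, not tuned settings.
    """
    if len(windows) < consecutive:
        return False
    recent = windows[-consecutive:]
    return all(failure_rate(w) > threshold for w in recent)
```

With a history such as `[SchedulingWindow(200, 2), SchedulingWindow(200, 12), SchedulingWindow(200, 15), SchedulingWindow(200, 21)]`, the last three windows have failure rates of 6%, 7.5%, and 10.5%, so `should_alert` returns `True` under the default 5% threshold. A real deployment would need thresholds and window sizes chosen from the platform's own baseline.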