SYNAQ Securemail Incident - 25/05/2020

Major, Resolved

SYNAQ experienced a major incident on May 25, 2020 affecting SYNAQ Securemail, lasting 3h 1m. The incident has been resolved; the full update timeline is below.

Started: May 25, 2020, 06:25 AM UTC
Resolved: May 25, 2020, 09:27 AM UTC
Duration: 3h 1m
Detected by Pingoru: May 25, 2020, 06:25 AM UTC

Affected components

SYNAQ Securemail

Update timeline

  1. investigating May 25, 2020, 06:25 AM UTC

    SYNAQ Securemail is currently experiencing an incident where emails are intermittently delayed when sending or receiving. Engineers are investigating the incident as a matter of urgency.

  2. resolved May 25, 2020, 09:27 AM UTC

    Dear Clients, the SYNAQ Securemail incident has been resolved and the service has returned to optimal functionality.

  3. postmortem May 27, 2020, 09:13 AM UTC

    Summary and Impact to Customers

    On Monday 25th May 2020 from 02:53 to 11:27, SYNAQ Securemail experienced a mail timeout incident. The impact was intermittent connectivity to and from our platform when sending outbound mail, which delayed outbound mail.

    Root cause and Solution

    The root cause of this event was the change logged on Sunday the 24th to move the Securemail environment to a new location within the Data Centre. The Securemail outbound servers all have multiple IP addresses assigned to them in order to combat blacklisting. The outbound servers use a script that allows only one IP address to be brought up per server at a time. Unfortunately, this script failed, and multiple IP addresses were brought up at the same time, causing a conflict. This caused timeouts when connecting to and from these servers on port 25, which delayed the sending of mail.

    To resolve the issue, the additional IP addresses were removed so that only a single IP address remained on each server. Mail then began to flow and returned to normal functionality.

    Remediation Actions

    • The automated scripts that failed have been removed and replaced with improved technology.
    • An improved monitoring check is being built to detect this scenario earlier, should it recur.
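
The failure mode described in the postmortem (more than one pool address active on an outbound server, followed by port 25 timeouts) can be illustrated with a small monitoring sketch. This is not SYNAQ's actual tooling: the function names, the server names, the snapshot format, and the `192.0.2.x` documentation addresses are all hypothetical. The duplicate-owner check also assumes that the "conflict" could involve the same address being active on more than one server, which the postmortem does not state explicitly.

```python
import socket
from collections import defaultdict
from typing import Dict, Iterable, List


def find_ip_problems(active_by_server: Dict[str, Iterable[str]],
                     max_active: int = 1) -> List[str]:
    """Return alert strings for a snapshot of active outbound IPs.

    active_by_server maps a (hypothetical) server name to the pool
    addresses currently up on it. How that snapshot is collected is
    outside the scope of this sketch.
    """
    alerts: List[str] = []
    owners: Dict[str, List[str]] = defaultdict(list)

    for server, ips in sorted(active_by_server.items()):
        unique_ips = sorted(set(ips))
        # Rule from the postmortem: only one address up per server at a time.
        if len(unique_ips) > max_active:
            alerts.append(
                f"{server}: {len(unique_ips)} addresses active "
                f"({', '.join(unique_ips)}), expected at most {max_active}"
            )
        for ip in unique_ips:
            owners[ip].append(server)

    # Assumption: the same address live on two servers is also a conflict.
    for ip, servers in sorted(owners.items()):
        if len(servers) > 1:
            alerts.append(
                f"{ip}: active on multiple servers ({', '.join(servers)})"
            )
    return alerts


def smtp_port_reachable(host: str, port: int = 25,
                        timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # includes timeouts and refused connections
        return False


if __name__ == "__main__":
    snapshot = {
        "out1": ["192.0.2.10"],
        "out2": ["192.0.2.20", "192.0.2.21"],  # violates the one-IP rule
    }
    for alert in find_ip_problems(snapshot):
        print(alert)
```

Keeping the address check as a pure function over a snapshot means it can be tested without touching real network interfaces. The `smtp_port_reachable` probe is separate because it covers the symptom (timeouts on port 25) rather than the cause.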