SYNAQ Cloud Mail Incident - 11/06/2019

Minor Resolved

SYNAQ experienced a minor incident on June 11, 2019 affecting SYNAQ Cloud Mail, lasting 8h 1m. The incident has been resolved; the full update timeline is below.

Started
Jun 11, 2019, 06:53 AM UTC
Resolved
Jun 11, 2019, 02:55 PM UTC
Duration
8h 1m
Detected by Pingoru
Jun 11, 2019, 06:53 AM UTC

Affected components

SYNAQ Cloud Mail

Update timeline

  1. investigating Jun 11, 2019, 06:53 AM UTC

    Dear Clients, SYNAQ Cloud Mail is experiencing mail delays. We are aware of the issue and we are working on the resolution as a matter of urgency.

  2. monitoring Jun 11, 2019, 01:53 PM UTC

    Dear Clients, the Cloud Mail incident has been resolved. Mail queues are currently working through the backlog and engineers are monitoring.

  3. resolved Jun 11, 2019, 02:55 PM UTC

    Dear Clients, mail flow is optimal again.

  4. postmortem Jul 23, 2019, 02:01 PM UTC

    Summary and Impact to Customers

    From Monday 10 June at 1:38pm to Tuesday 11 June at 4:55pm, SYNAQ Cloud Mail experienced a minor service incident that caused mail delays for a subset of clients. Certain users experienced delays of up to 2 hours for some of their mail.

    Root Cause and Solution

    The root cause of this event was a failed controller on a backend storage device. As a result of the failure, all data access to the storage had to fail over to a single data path. This failover caused an abnormal level of usage on the single data path, which increased latency for data reads and writes and in turn delayed mail delivery to users' mailboxes. To resolve the issue, the controller was replaced and dual paths were restored, which returned mail delivery performance to normal for the affected users.

    Remediation Actions

    • Whilst the built-in redundancy on the storage array worked to prevent a complete loss of access to mailbox data, SYNAQ engineers are working with our storage vendor to increase the single data path capacity to handle failover load without impacting users.
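
The failover effect described in the postmortem can be illustrated with a rough sketch. It assumes load is spread evenly across active data paths and uses a simple M/M/1 queueing approximation for latency. The function names, the load and capacity figures, and the queueing model itself are illustrative assumptions, not values or methods from SYNAQ's report.

```python
def path_utilization(total_load: float, path_capacity: float, active_paths: int) -> float:
    """Fraction of each active path's capacity in use, assuming an even spread of load."""
    if active_paths < 1:
        raise ValueError("at least one active path is required")
    return total_load / (path_capacity * active_paths)


def relative_latency(utilization: float) -> float:
    """Latency relative to an idle path under an M/M/1 model: 1 / (1 - utilization)."""
    if utilization >= 1.0:
        return float("inf")  # demand exceeds capacity, so the queue grows without bound
    return 1.0 / (1.0 - utilization)


# Hypothetical numbers, not taken from the incident report.
load = 120.0      # total I/O demand, arbitrary units
capacity = 200.0  # capacity of one data path, same units

dual = path_utilization(load, capacity, active_paths=2)
single = path_utilization(load, capacity, active_paths=1)

print(f"dual paths:  {dual:.0%} utilized, latency {relative_latency(dual):.2f}x idle")
print(f"single path: {single:.0%} utilized, latency {relative_latency(single):.2f}x idle")
```

Under these assumed figures, losing one path doubles the utilization of the survivor, and latency rises faster than linearly as utilization approaches 100%. That is one plausible reading of why the remediation targets single-path capacity rather than redundancy alone.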