Ziron incident

[various] database issues

Major, Resolved

Ziron experienced a major incident on December 5, 2018 affecting API v1, Outbound Voice, Inbound SMS, Outbound SMS, and the Ziron Dashboard, lasting 3h 9m. The incident has been resolved; the full update timeline is below.

Started
Dec 05, 2018, 11:21 PM UTC
Resolved
Dec 06, 2018, 02:31 AM UTC
Duration
3h 9m
Detected by Pingoru
Dec 05, 2018, 11:21 PM UTC

Affected components

API v1, Outbound Voice, Inbound SMS, Outbound SMS, Ziron Dashboard

Update timeline

  1. investigating Dec 05, 2018, 11:21 PM UTC

    We are currently investigating database issues that may be affecting API, SMS, outbound voice and dashboard services.

  2. monitoring Dec 05, 2018, 11:29 PM UTC

    The affected database servers have been removed from the pool and normal service has been restored.

  3. identified Dec 05, 2018, 11:54 PM UTC

    Unfortunately, we are seeing a return of the previous issues and are working to restore service.

  4. monitoring Dec 05, 2018, 11:55 PM UTC

    Service has again been restored. We will be providing a further update shortly.

  5. monitoring Dec 06, 2018, 12:15 AM UTC

    The affected database cluster continues to be stable. We are working on remedial, non-service-affecting action and will provide an update by 01:00 UTC.

  6. monitoring Dec 06, 2018, 12:59 AM UTC

    Most services remain stable, although we are currently seeing degraded performance for inbound and outbound SMS. An update will follow by 01:30 UTC.

  7. monitoring Dec 06, 2018, 01:13 AM UTC

    Full service has been restored for inbound and outbound SMS. We will continue to monitor and investigate the earlier database issues, and will provide an update by 02:30 UTC.

  8. resolved Dec 06, 2018, 02:31 AM UTC

    This incident has been resolved. A postmortem will follow shortly.