Cronofy incident

Degraded performance in our US data center

Minor · Resolved

Cronofy experienced a minor incident on December 15, 2022, affecting Scheduler, API, Background Processing, and Developer Dashboard, lasting 2h 25m. The incident has been resolved; the full update timeline is below.

Started: Dec 15, 2022, 04:11 PM UTC
Resolved: Dec 15, 2022, 06:36 PM UTC
Duration: 2h 25m
Detected by Pingoru: Dec 15, 2022, 04:11 PM UTC

Affected components

Scheduler, API, Background Processing, Developer Dashboard

Update timeline

  1. investigating Dec 15, 2022, 04:11 PM UTC

    We are investigating degraded performance in our US data center.

  2. investigating Dec 15, 2022, 04:34 PM UTC

    We have taken steps to ease the pressure on our US database. This has resulted in better, but still degraded, performance. We're continuing to investigate the root cause.

  3. identified Dec 15, 2022, 05:28 PM UTC

    Our US database is experiencing slower-than-usual disk performance. We have taken steps to ease the pressure, such as temporarily disabling maintenance tasks, and the amount of queued work is reducing. We're continuing to work to bring performance back to its usual level. We have also added bigger database nodes to the cluster in case we need to fail over to them. However, failing over would require a short outage, so we are holding off on it for now.

  4. monitoring Dec 15, 2022, 05:58 PM UTC

    The queued work has now been processed and we can see that performance is no longer degraded. We are continuing to monitor the situation.

  5. resolved Dec 15, 2022, 06:36 PM UTC

    Performance was degraded in our US data center for around 2 hours, between 16:00 and 18:00 UTC. This was caused by our primary database struggling under load. We removed background processes to reduce the load as much as possible and help the system return to normal operation. API responses may have been slower than usual during this period. Background processing, such as synchronizing calendar data, will also have been slower than usual, with messages taking up to 3 minutes to be picked up at the peak of the incident. We will be bringing forward the maintenance to upgrade this database cluster from this coming Sunday, 18th December, to tomorrow, Friday, 16th December. A notice for this maintenance change will be posted shortly.