Peakon incident

Elevated error levels on survey and dashboard

Minor, Resolved

Peakon experienced a minor incident on March 17, 2020 that affected the Dashboard, Survey, and API components and lasted 54 minutes. The incident has been resolved; the full update timeline is below.

Started
Mar 17, 2020, 06:29 AM UTC
Resolved
Mar 17, 2020, 07:23 AM UTC
Duration
54m
Detected by Pingoru
Mar 17, 2020, 06:29 AM UTC

Affected components

Dashboard, Survey, API

Update timeline

  1. investigating Mar 17, 2020, 06:29 AM UTC

    We are seeing an increased error rate for the Peakon dashboard and survey applications due to high database load. We are investigating to find the root cause of the issue.

  2. investigating Mar 17, 2020, 06:51 AM UTC

    We have identified the root cause to be a slow database query, and will be looking at ways to improve performance. Meanwhile, we are processing background jobs, including email sending, at a reduced rate. Error levels have now returned to normal and we are monitoring the situation closely.

  3. investigating Mar 17, 2020, 07:12 AM UTC

    We are continuing to investigate this issue.

  4. identified Mar 17, 2020, 07:14 AM UTC

    The web applications are now operating normally. Background jobs are still being processed at a reduced rate.

  5. resolved Mar 17, 2020, 07:23 AM UTC

    All systems are now back to operating as normal.