Teem incident

Degraded performance: app.teem.com and other services

Minor · Resolved

Teem experienced a minor incident on December 9, 2019 affecting Web Interface and EventBoard, lasting 6d 18h. The incident has been resolved; the full update timeline is below.

Started
Dec 09, 2019, 09:39 PM UTC
Resolved
Dec 16, 2019, 03:55 PM UTC
Duration
6d 18h
Detected by Pingoru
Dec 09, 2019, 09:39 PM UTC

Affected components

Web Interface, EventBoard

Update timeline

  1. investigating Dec 09, 2019, 09:39 PM UTC

    We’re currently experiencing a service disruption, with reports of loading issues on app.teem.com, EventBoard, and LobbyConnect. Our team is working to identify the root cause and implement a solution. Next update will be by 4:30pm MT/6:30pm ET.

  2. investigating Dec 09, 2019, 11:18 PM UTC

    We are continuing to investigate the cause of the issue. The system is catching back up and performance is gradually improving. Next update will be by 7pm MT/9pm ET.

  3. monitoring Dec 10, 2019, 01:00 AM UTC

    Performance has returned to normal for app.teem.com, EventBoard, and LobbyConnect, and all other apps/services are also fully functional. We have not identified the root cause yet, but are continuing to investigate and monitor. Next update will be tomorrow by 9am MT/11am ET.

  4. monitoring Dec 10, 2019, 03:52 PM UTC

    Performance remains normal. We have made adjustments to our back-end systems and are continuing to monitor the results through the day. Next update will be today by 4pm MT/6pm ET.

  5. investigating Dec 10, 2019, 05:19 PM UTC

    We’re currently experiencing a recurrence of the service disruption, with reports of slow loading on app.teem.com, EventBoard, and LobbyConnect, as well as other services such as device registrations. Our team is actively investigating. Next update will be by 12:30pm MT/2:30pm ET.

  6. investigating Dec 10, 2019, 07:21 PM UTC

    We are continuing to investigate the cause of the increased load on our systems. Performance continues to be impacted, with extended periods of slowness or applications failing to load. Next update will be by 3pm MT/5pm ET.

  7. investigating Dec 10, 2019, 09:55 PM UTC

    We are continuing to investigate the increased load on the system and its cause. Performance continues to be impacted, with extended periods of slowness or applications failing to load. Next update will be by 6pm MT/8pm ET.

  8. investigating Dec 11, 2019, 12:41 AM UTC

    Performance has returned to normal for app.teem.com, EventBoard, and LobbyConnect, and all other apps/services are also fully functional. The investigation is ongoing and is focused on finding and addressing the reason for the increased load yesterday and today. Next update will be tomorrow by 9am MT/11am ET.

  9. identified Dec 11, 2019, 04:25 PM UTC

    Performance has been normal for app.teem.com, EventBoard, LobbyConnect, and all other apps/services since the last update. Teem has identified an abnormal spike in change notifications for certain calendar providers and is in the process of updating handling and processing systems to account for the change in behavior. Next update will be as status changes or tomorrow by 9am MT/11am ET.

  10. identified Dec 12, 2019, 07:38 PM UTC

    Performance has been normal for app.teem.com, EventBoard, LobbyConnect, and all other apps/services since the last update. Teem is preparing an update to handling and processing systems (currently in testing) and will deploy it when testing is complete. Next update will be as status changes or tomorrow by 9am MT/11am ET.

  11. monitoring Dec 13, 2019, 06:22 PM UTC

    Performance has been normal for app.teem.com, EventBoard, LobbyConnect, and all other apps/services since the last update. Teem has released an update to help prevent degraded performance in this scenario in the future and will continue to monitor throughout the day, after which we will resolve the incident.

  12. resolved Dec 16, 2019, 03:55 PM UTC

    After monitoring throughout the weekend, this incident has been resolved.

  13. postmortem Feb 07, 2020, 11:56 PM UTC

    On December 9th the Teem platform received a large spike in traffic from the Google Calendar provider, which ultimately created a large backlog of asynchronous platform jobs. The systems over-balanced these jobs, causing heavy load on the database. Tuning the asynchronous capacity for the related workloads helped clear the backlog and prevent further issues. Additional database maintenance was performed in late January to further reduce the possibility of recurrence.
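
The postmortem describes a backlog of asynchronous jobs putting heavy load on the database, relieved by tuning asynchronous capacity. As a rough illustration only, and not Teem's actual implementation, the Python sketch below shows one common way to cap how many backlog jobs run at once so that a burst of work cannot overwhelm a shared database. The names `run_jobs`, `fake_db_job`, `simulate`, and the `max_concurrent` parameter are all hypothetical, and the database call is replaced by a short sleep.

```python
import asyncio


async def run_jobs(jobs, max_concurrent):
    """Run every job, with at most max_concurrent in flight at once.

    Returns the peak number of jobs that were running simultaneously.
    """
    limiter = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def run_one(job):
        nonlocal in_flight, peak
        # Jobs beyond the limit wait here instead of hitting the database.
        async with limiter:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                await job()
            finally:
                in_flight -= 1

    await asyncio.gather(*(run_one(job) for job in jobs))
    return peak


async def fake_db_job():
    # Hypothetical stand-in for a job that queries the database.
    await asyncio.sleep(0.01)


def simulate(backlog_size, max_concurrent):
    """Process a backlog of identical fake jobs and report peak concurrency."""
    jobs = [fake_db_job for _ in range(backlog_size)]
    return asyncio.run(run_jobs(jobs, max_concurrent))


if __name__ == "__main__":
    print("peak concurrency:", simulate(backlog_size=50, max_concurrent=5))
```

With a cap in place, a sudden backlog takes longer to drain, but the database sees a bounded number of concurrent jobs rather than the whole spike at once. The right cap depends on the workload and would need to be tuned against real measurements.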