Teem incident

Multiple service outage affecting Teem platform

Critical · Resolved

Teem experienced a critical incident on January 25, 2020, affecting Web Interface, Mobile Data, and other components, and lasting 2h 57m. The incident has been resolved; the full update timeline is below.

Started
Jan 25, 2020, 04:07 PM UTC
Resolved
Jan 25, 2020, 07:05 PM UTC
Duration
2h 57m
Detected by Pingoru
Jan 25, 2020, 04:07 PM UTC

Affected components

Web Interface, Mobile Data, API, EventBoard, LobbyConnect

Update timeline

  1. identified Jan 25, 2020, 04:07 PM UTC

    Teem has identified the issue and is working to restore service.

  2. monitoring Jan 25, 2020, 05:25 PM UTC

    Service has been restored and Teem is monitoring results.

  3. resolved Jan 25, 2020, 07:05 PM UTC

    On Jan 25 at approximately 0530 MT, a failing maintenance and backup process began causing back-pressure up the replication line of Teem databases. This in turn filled both normal and overflow disk space, which eventually made a primary database unavailable and caused the service disruption that followed. The failing process was identified and corrected, the blockage was released, and the systems were brought back online at 1020 MT. Data and replication backlogs will continue to drain throughout the weekend, and progress will continue to be monitored. We apologize for any inconvenience this downtime may have caused and thank you for your continued use of the Teem platform.

  4. postmortem Feb 08, 2020, 12:05 AM UTC

    On Jan 25 at approximately 0530 MT, a failing maintenance and backup process began causing back-pressure up the replication line of Teem databases. This in turn filled both normal and overflow disk space, which eventually made a primary database unavailable and caused the service disruption that followed. The failing process was identified and corrected, the blockage was released, and the systems were brought back online at 1020 MT.
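
The vendor's narrative gives times in MT, while the timeline above is in UTC. The sketch below lines the two up. It assumes "MT" means Mountain Standard Time (UTC-7), which is what applies in January; the helper name mt_to_utc and the variable names are illustrative and not part of any Teem tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "MT" on Jan 25 is Mountain Standard Time (UTC-7); no DST in January.
MST = timezone(timedelta(hours=-7), "MST")


def mt_to_utc(hhmm: str, year: int = 2020, month: int = 1, day: int = 25) -> datetime:
    """Convert a 4-digit 24-hour Mountain Time string such as "0530" to UTC."""
    local = datetime(year, month, day, int(hhmm[:2]), int(hhmm[2:]), tzinfo=MST)
    return local.astimezone(timezone.utc)


process_failed = mt_to_utc("0530")  # backup process begins failing
back_online = mt_to_utc("1020")     # systems brought back online
# First status update ("identified"), taken from the timeline, already in UTC.
first_update = datetime(2020, 1, 25, 16, 7, tzinfo=timezone.utc)

print(process_failed.strftime("%H:%M UTC"))  # 12:30 UTC
print(back_online.strftime("%H:%M UTC"))     # 17:20 UTC
print(first_update - process_failed)         # 3:37:00
print(back_online - process_failed)          # 4:50:00
```

Under that assumption, the backup process began failing at 12:30 UTC, about 3h 37m before the first status update at 04:07 PM UTC, and systems were back online at 17:20 UTC, five minutes before the "monitoring" update at 05:25 PM UTC.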