TrekkSoft incident

TrekkSoft servers issue affecting several production functionalities

Major Resolved

TrekkSoft experienced a major incident on July 6, 2021 affecting TrekkSoft Backoffice, TrekkSoft API, and three other components, lasting 40 minutes. The incident has been resolved; the full update timeline is below.

Started
Jul 06, 2021, 05:28 PM UTC
Resolved
Jul 06, 2021, 06:08 PM UTC
Duration
40m
Detected by Pingoru
Jul 06, 2021, 05:28 PM UTC

Affected components

TrekkSoft Backoffice, TrekkSoft API, TrekkSoft Mobile App (mPOS), POS Desk, TrekkSoft Website Builder

Update timeline

  1. investigating Jul 06, 2021, 05:28 PM UTC

    We are currently experiencing issues with TrekkSoft servers. This is affecting most of our product functionalities. Our developers are investigating with the highest priority to identify the root cause of the issue and its resolution. We will keep you updated and apologize for the inconvenience caused.

  2. resolved Jul 06, 2021, 06:08 PM UTC

    The issue with the TrekkSoft servers has been resolved. All affected services are working as expected again. Our developers are continuing to monitor the situation closely to ensure there are no further performance issues.

  3. postmortem Jul 07, 2021, 01:28 PM UTC

    **Root cause** A new release containing database changes was being pushed to production, which locked access to some database tables. Because of the high volume of requests to those locked tables, requests piled up in the queue and overloaded our servers.

    **What happened** For an estimated 45 minutes, users were unable to access their TrekkSoft sites because our production servers were down.

    **What we did** Once our developers identified that the new release was degrading the performance of our services and overloading the servers, they restarted the servers and rolled back the release to return them to their normal functioning state.

    **The consequences** During the 45-minute window of the incident, no bookings could be processed.

    **Learnings** To prevent similar incidents in the future, we have updated our internal guidelines on new releases to avoid releasing critical changes during peak hours, when high availability is needed. We apologize for any inconvenience this might have caused you.
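The pile-up described in the root cause can be illustrated with a rough back-of-the-envelope model. This is only a sketch under simplifying assumptions, not a description of TrekkSoft's actual infrastructure: it assumes every incoming request needs one of the locked tables, that each blocked request holds a server worker slot until the lock is released, and that the arrival rate is constant. The function name `backlog_during_lock` and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BacklogResult:
    seconds_until_saturated: float  # time until every worker slot is blocked
    queued_at_release: int          # requests still waiting for a slot at release


def backlog_during_lock(arrival_rate: float, worker_slots: int,
                        lock_seconds: float) -> BacklogResult:
    """Estimate the request backlog while a table lock is held.

    Assumption (hypothetical model): every request touches a locked table,
    so it blocks and keeps its worker slot until the lock is released.
    """
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")

    total_arrivals = int(arrival_rate * lock_seconds)
    # Blocked requests fill the worker pool first; the rest wait in the queue.
    blocked_in_workers = min(total_arrivals, worker_slots)
    queued = total_arrivals - blocked_in_workers
    # The pool saturates once as many requests have arrived as there are slots.
    seconds_until_saturated = min(lock_seconds, worker_slots / arrival_rate)
    return BacklogResult(seconds_until_saturated, queued)


# Illustrative numbers only: 50 requests/s, 200 worker slots, 60 s lock.
result = backlog_during_lock(arrival_rate=50, worker_slots=200, lock_seconds=60)
print(result)
```

With these made-up inputs the worker pool is fully blocked after 4 seconds and 2800 requests are waiting by the time the lock is released. This shows why even a short lock during high traffic can overload servers, and why the same change outside peak hours is far less risky.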