TrekkSoft incident

Global issues in several TrekkSoft production functionalities

Major · Resolved

TrekkSoft experienced a major incident on February 15, 2024, affecting TrekkSoft Backoffice, TrekkSoft API, and four other components. The incident was open for 3h 16m and has been resolved; the full update timeline is below.

Started: Feb 15, 2024, 11:00 AM UTC
Resolved: Feb 15, 2024, 02:16 PM UTC
Duration: 3h 16m
Detected by Pingoru: Feb 15, 2024, 11:00 AM UTC

Affected components

TrekkSoft Backoffice, TrekkSoft API, TrekkSoft Mobile App (mPOS), POS Desk, TrekkSoft Website Builder, ExperienceBank - Channel Manager

Update timeline

  1. investigating Feb 15, 2024, 11:00 AM UTC

    We are currently experiencing global issues in several TrekkSoft production functionalities (among others, Backoffice and POS Desk). Our developers are already investigating the root cause. We will keep you updated and apologize for the inconvenience.

  2. monitoring Feb 15, 2024, 11:42 AM UTC

    The issue has been identified and a fix has been implemented. All system components are operational. We'll keep monitoring it.

  3. resolved Feb 15, 2024, 02:16 PM UTC

    The incident has been resolved and all TrekkSoft functionalities are operating normally again. We will provide a postmortem of the incident in the following days. Once again, we apologize for any inconvenience this may have caused you.

  4. postmortem Feb 19, 2024, 09:33 AM UTC

    **Incident Date**: February 15, 2024

    **Incident Duration**: Approximately 1 hour

    **Affected Services**: All services

    **Incident Description**: At approximately 11:30 AM CET on February 15, 2024, a bug fix caused an unexpected slowdown of our database. As a result, all services dependent on this database were impacted.

    **Impact**: The incident rendered it impossible to perform operations on the database, leading to a halt in functionality for all of our services. This disruption lasted for around one hour until our developers were able to mitigate the issue.

    **Resolution**: The incident was resolved by rolling back the release.

    **Preventive Measures and Recommendations:**

    1. **Testing in our test environment:** Improve options to test database impacts on our test environment, incl. smoke/regression/load tests.
    2. **Speed up our roll-back process**

    Despite our strict release policy, which includes thorough tests on our test environment, for this specific bug fix our test environment exceptionally didn't cover all implications. By implementing these measures, we can proactively address and mitigate similar incidents in the future, ensuring the continued reliability of our services.
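
As an illustration of the first preventive measure, one kind of pre-release database smoke test is a check that a changed query does not fall back to a full table scan. The postmortem does not say what caused the slowdown, so this is only a minimal sketch of one possible check, not TrekkSoft's actual tooling. It uses Python's built-in `sqlite3` as a stand-in database; the `bookings` table, the `idx_bookings_customer` index, and the helper names are hypothetical.

```python
import sqlite3

# Hypothetical query under test; the table and columns are illustrative only.
QUERY = "SELECT id, total FROM bookings WHERE customer_id = ?"


def full_scans(conn, sql, params=()):
    """Return the query-plan steps that read an entire table."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each plan row is (id, parent, notused, detail). In SQLite the detail
    # text starts with "SCAN" for a full scan and "SEARCH" for an index lookup.
    return [row[3] for row in plan if row[3].startswith("SCAN")]


def build_test_db(with_index):
    """Create an in-memory stand-in for the test environment's schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bookings ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    if with_index:
        conn.execute(
            "CREATE INDEX idx_bookings_customer ON bookings (customer_id)"
        )
    return conn


if __name__ == "__main__":
    for with_index in (False, True):
        scans = full_scans(build_test_db(with_index), QUERY, (1,))
        status = "FAIL" if scans else "OK"
        print(f"index={with_index}: {status} {scans}")
```

A check like this is deterministic, so it can run on every release candidate without timing noise. It complements load tests, which are still needed to catch slowdowns that only appear with production-sized data.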