TrekkSoft incident

Registered system slowdown, December 2, 2024

Minor · Resolved

TrekkSoft experienced a minor incident on December 2, 2024, affecting TrekkSoft Backoffice, TrekkSoft Mobile App (mPOS), and POS Desk, lasting 1d 1h. The incident has been resolved; the full update timeline is below.

Started
Dec 02, 2024, 11:40 AM UTC
Resolved
Dec 03, 2024, 12:52 PM UTC
Duration
1d 1h
Detected by Pingoru
Dec 02, 2024, 11:40 AM UTC

Affected components

TrekkSoft Backoffice, TrekkSoft Mobile App (mPOS), POS Desk

Update timeline

  1. investigating Dec 02, 2024, 11:40 AM UTC

    We are currently experiencing a system delay that is causing a general slowdown. This may result in error messages and may prevent some bookings from being placed. The affected areas include the booking widget, POS Desk, and the mPOS mobile app. Our developers are actively investigating the root cause. We sincerely apologize for any inconvenience and appreciate your patience as we work toward a resolution. If you have any questions or concerns, please feel free to reach out to us. We will keep you informed of any updates through our https://status.trekksoft.com page. If you haven't subscribed yet, we encourage you to do so to stay updated.

  2. investigating Dec 02, 2024, 12:37 PM UTC

    We are continuing to investigate this issue.

  3. monitoring Dec 02, 2024, 12:39 PM UTC

    We have identified the root cause, applied a fix, and are currently monitoring the results.

  4. monitoring Dec 02, 2024, 05:27 PM UTC

    We are continuing to monitor for any further issues.

  5. resolved Dec 03, 2024, 12:52 PM UTC

    The incident has been resolved, and all TrekkSoft functionalities are operating as expected. Our investigation revealed that the system slowdown, which caused error messages during booking attempts and prevented some bookings from being processed, was triggered by an unexpected increase in system requests (e.g. a high volume of reporting requests). We have implemented a fix that appears to have resolved the issue and are actively exploring measures to prevent similar issues in the future. A detailed postmortem of the incident will be shared in the coming days. We sincerely apologize once again for any inconvenience caused.

  6. postmortem Dec 09, 2024, 02:14 PM UTC

    **Incident Date**: December 2nd 2024

    **Incident Duration**: Approximately 5 hours (intermittently)

    **Affected Services**: TrekkSoft Backoffice, POS Desk, TrekkSoft Mobile App (mPOS)

    **Incident Description**: On Monday, Dec 2nd, at 12:13 PM CET, system slowness was reported, leading to an inability to process bookings. Earlier that morning, an alert indicated web response times exceeding 1.0 seconds for over 10 minutes.

    **Impact**: Investigation revealed it was caused by resource-heavy queries, making the system unresponsive.

    **Resolution**: This issue coincided with increased demand for year-end reports, likely contributing to the slowdown. Queues for reporting and scheduling were overloaded, with pending messages being gradually consumed after being moved to temporary queues.

    **Preventive Measures**: Further monitoring and optimization are ongoing to prevent similar issues.
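
The postmortem mentions an alert on web response times exceeding 1.0 seconds for over 10 minutes. As a rough illustration of how a sustained-threshold alert of that kind can work, here is a minimal Python sketch. It is not TrekkSoft's monitoring setup: the function name `sustained_breach`, the sample format, and the timestamps in the example are all hypothetical. Only the 1.0 second threshold and the 10 minute window come from the postmortem.

```python
from datetime import datetime, timedelta


def sustained_breach(samples, threshold_s=1.0, window=timedelta(minutes=10)):
    """Return the timestamp at which the alert would fire, or None.

    samples: iterable of (timestamp, response_time_seconds), sorted by time.
    The alert fires once response times have stayed above threshold_s
    continuously for longer than `window`. (Hypothetical helper, not a
    real monitoring API.)
    """
    breach_start = None
    for ts, response_time in samples:
        if response_time > threshold_s:
            if breach_start is None:
                breach_start = ts  # first slow sample of this streak
            if ts - breach_start > window:
                return ts
        else:
            breach_start = None  # a healthy sample resets the streak
    return None


# Made-up data: one sample per minute, every response taking 1.5 s.
start = datetime(2024, 12, 2, 8, 0)
samples = [(start + timedelta(minutes=i), 1.5) for i in range(15)]
print(sustained_breach(samples))  # 2024-12-02 08:11:00
```

Requiring the breach to persist for the whole window keeps a single slow request from paging anyone, at the cost of reacting at least ten minutes after a real slowdown begins.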