TrekkSoft incident
Registered Incident - Not able to take bookings (Booking Widget, POS, mPOS)
TrekkSoft experienced a major incident on December 6, 2024, affecting TrekkSoft API, TrekkSoft Mobile App (mPOS), and 1 more component. It lasted 2d 18h. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Dec 06, 2024, 01:51 PM UTC
We are currently experiencing an issue that is preventing bookings from being placed. The affected parts of the system include the Booking Widget, PoS desk, and the mPoS mobile app, which is currently inaccessible. Users who host their site via the TrekkSoft Website Builder may not be affected. While channel manager bookings are still being taken via OTAs, they might not be processed on our system at the moment. Our developers are already investigating the root cause of the issue. We apologize for any inconvenience this may cause and appreciate your patience as we work to resolve the issue. If you have any questions or concerns, please don't hesitate to contact us. We will keep you updated on the progress of this issue on our status page. If you haven’t subscribed to it yet, we encourage you to do so.
- investigating Dec 06, 2024, 02:11 PM UTC
We are continuing to investigate this issue.
- investigating Dec 06, 2024, 02:12 PM UTC
We are continuing to investigate this issue.
- monitoring Dec 06, 2024, 02:36 PM UTC
A fix has been implemented and we are monitoring the results.
- monitoring Dec 06, 2024, 03:07 PM UTC
After a fix was implemented, the affected parts became functional again. Unfortunately, we are again seeing degraded performance that is preventing bookings from being placed. Our Engineering team is continuing to investigate. The affected parts of the system, including the Booking Widget, PoS desk, and the mPoS mobile app, are currently unable to take bookings. While channel manager bookings are still being taken via OTAs, they might not be processed on our system at the moment. Once again, we apologize for any issues this may cause, and we will keep you posted.
- monitoring Dec 06, 2024, 04:48 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Dec 09, 2024, 08:37 AM UTC
The incident has been resolved, and all TrekkSoft functionalities have been operating as expected since Friday, 7:38 pm CET. Our investigation revealed that the system slowdown was caused by an increase in CPU usage on our database cluster. By scaling up the database cluster, we were able to stabilize the system. We will also explore measures to prevent similar issues in the future. A detailed postmortem of the incident will be shared in the coming days. We sincerely apologize once again for any inconvenience caused.
- postmortem Dec 10, 2024, 04:50 PM UTC
**Incident Date**: December 6th, 2024
**Incident Duration**: Approximately 4 hours (intermittently)
**Affected Services**: TrekkSoft Backoffice, POS Desk, TrekkSoft Mobile App (mPOS)
**Incident Description**: On Friday, Dec 6th, at 1:44 pm CET, the system experienced a slowdown due to a combination of slow schedule queue messages and resource-intensive reporting queries.
**Impact**: The slowdown affected TrekkSoft Backoffice, POS Desk, and TrekkSoft Mobile App (mPOS), which intermittently prevented bookings from being made.
**Resolution**: Slow schedule messages were moved to a temporary queue to stabilize the system.
**Preventive Measures**: To prevent further disruptions, we scaled up the system capacity to handle the increased load, ensuring smoother operations over the weekend. The system has since stabilized, and further optimizations are being planned to mitigate future issues.