Qlik Cloud incident

Qlik Cloud - Europe (Ireland)

Qlik Cloud experienced a minor incident on February 25, 2025, affecting Qlik Cloud – AWS – Europe (Ireland) and lasting 1h 45m. The incident has been resolved; the full update timeline is below.

Started: Feb 25, 2025, 09:45 AM UTC
Resolved: Feb 25, 2025, 11:30 AM UTC
Duration: 1h 45m
Detected by Pingoru: Feb 25, 2025, 09:45 AM UTC

Affected components

Qlik Cloud – AWS – Europe (Ireland)

Update timeline

investigating Feb 25, 2025, 09:45 AM UTC

We’re currently experiencing a performance degradation issue in the Qlik Cloud - Europe (Ireland) region. Customers may experience slowdowns and intermittent failures when connecting to Qlik Cloud applications.
identified Feb 25, 2025, 10:05 AM UTC

The issue has been identified and a fix is being implemented.
monitoring Feb 25, 2025, 10:05 AM UTC

All Qlik Cloud services have recovered. We will continue to monitor to confirm that service is fully restored, and we will update the status accordingly.
resolved Feb 25, 2025, 11:30 AM UTC

This issue has been resolved. Your patience and support have been greatly appreciated!
postmortem Mar 03, 2025, 06:01 PM UTC

**Root Cause Analysis**

On February 25, 2025, from 09:16 to 09:58 UTC, our platform experienced an incident due to a traffic surge, which led to critical services reaching their scaling limits. This resulted in performance degradation for users in Qlik Cloud - Europe (Ireland) attempting to browse the hub to open their apps. Qlik promptly initiated mitigation efforts to scale up impacted areas, restoring system stability.

**Root Cause Summary and Remediation**

The incident was triggered by a sudden increase in traffic that exceeded expected operational thresholds, requiring additional scaling. While automated scaling mechanisms were in place, certain bottlenecks prevented immediate expansion, necessitating manual intervention. Once scaling was applied, stability was restored, and further optimization was performed to enhance long-term resilience.

We appreciate your patience and trust as we continuously enhance the performance and stability of our platform. To prevent similar occurrences, we have strengthened our monitoring and alerting capabilities for earlier detection, rescaled and optimized key bottleneck areas, and refined our scaling strategies to improve system resilience under high load conditions.