Cornerstone incident

Intermittent Galaxy Latency

Notice · Resolved

Cornerstone experienced a notice-level incident on January 27, 2026 affecting the Uptime and Response Time components, lasting 2d 17h. The incident has been resolved; the full update timeline is below.

Started
Jan 27, 2026, 05:46 PM UTC
Resolved
Jan 30, 2026, 11:00 AM UTC
Duration
2d 17h
Detected by Pingoru
Jan 27, 2026, 05:46 PM UTC

Affected components

Uptime · Response Time

Update timeline

  1. monitoring Jan 27, 2026, 05:46 PM UTC

    Some Galaxy customers on UK SL1, UK SL2, UK SL3, FRA SL1, and CDG SL1 may have experienced intermittent latency between 1:30 AM and 5:30 AM PT. Performance has since stabilized and returned to normal levels. Our teams are actively monitoring the environment to ensure sustained stability.

  2. resolved Jan 30, 2026, 11:00 AM UTC

    A fix was implemented on January 27, and we have not observed a similar recurrence during continued monitoring. The issue is now considered resolved.

  3. postmortem Feb 02, 2026, 04:00 PM UTC

    **Incident Summary:** Between January 14 and January 27, 2026, some users accessing LXP EU PRD 02 experienced intermittent latency while navigating the application. During this time, users saw slower page loads and delayed responses across parts of the platform, resulting in temporarily degraded performance. CSX and Galaxy customers on the UK SL1, UK SL2, UK SL3, FRA SL1, and CDG SL1 swimlanes may also have experienced related impact.

    **Root Cause:** The performance issues were caused by inefficiencies in application caching behavior, which increased load on backend services and reduced overall responsiveness. Further investigation determined that regressive calls to the caching layer, combined with exhaustion of the cache network pool, led to sustained performance degradation. Full identification of the root cause required extended analysis and collaboration with AWS infrastructure partners.

    **Corrective Actions:** The issue was addressed through a phased approach:

    Hotfix 1 (January 14 - 19, 2026)
    * Action: Updated cache invalidation logic to prevent unnecessary bulk cache operations when users joined or left communities
    * Outcome: Performance improved, but intermittent latency continued under certain conditions

    Hotfix 2 (January 23 - 25, 2026)
    * Action: Optimized caching behavior related to skills translations and status updates in Skills Studio
    * Outcome: Additional performance improvements were observed; however, the underlying issue was not fully resolved

    Additional Mitigation Measures
    * Upscaled database and application infrastructure to support increased load
    * Tuned application pod and thread configurations

    Final Fix (January 27, 2026)
    * Action: Increased the cache network pool size in alignment with AWS recommendations
    * Outcome: Application performance was fully restored, and the incident is considered resolved

    **Preventive Measures:** To reduce the risk of recurrence, we are implementing the following improvements:
    * Enhancements to optimize cache management
    * Comprehensive review of caching workflows and network pool utilization
    * Improved monitoring to proactively detect cache inefficiencies and early signs of performance degradation
    * Ongoing optimization of application and infrastructure configurations to support sustained performance
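The failure mode described in the root cause (exhaustion of the cache network pool) and the final fix (raising the pool size to match peak concurrency) can be illustrated with a minimal sketch. This is a generic bounded connection pool, not Cornerstone's or AWS's actual implementation; the class, names, and sizes here are hypothetical and purely illustrative:

```python
# Hypothetical sketch: a bounded pool of cache connections. When concurrent
# demand exceeds the pool size, callers wait and then fail -- the kind of
# sustained degradation described in the root cause. Sizing the pool for
# peak concurrency (the "final fix") removes the contention.
import queue


class ConnectionPool:
    def __init__(self, max_connections):
        # Pre-fill the pool with placeholder connection handles.
        self._pool = queue.Queue(maxsize=max_connections)
        for i in range(max_connections):
            self._pool.put(f"conn-{i}")

    def acquire(self, timeout=0.1):
        # Block briefly for a free connection; fail if none frees up.
        try:
            return self._pool.get(timeout=timeout)
        except queue.Empty:
            raise RuntimeError("cache network pool exhausted")

    def release(self, conn):
        self._pool.put(conn)


# A pool sized below peak concurrency: the third caller times out and fails.
small = ConnectionPool(max_connections=2)
held = [small.acquire(), small.acquire()]
try:
    small.acquire()
except RuntimeError as exc:
    print(exc)  # callers beyond the pool size fail under load

# Sizing the pool for peak concurrency lets all callers proceed.
large = ConnectionPool(max_connections=3)
held = [large.acquire(), large.acquire(), large.acquire()]
```

In production the same idea appears as a `max_connections`-style setting on the cache client; the remediation amounts to raising that limit (and monitoring pool utilization, per the preventive measures) rather than changing application logic.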