UTHPC incident

Kubernetes network connections

Minor Resolved

UTHPC experienced a minor incident on January 14, 2026, lasting 1 hour and 31 minutes. The incident has been resolved; the full update timeline is below.

Started
Jan 14, 2026, 02:15 PM UTC
Resolved
Jan 14, 2026, 03:46 PM UTC
Duration
1 hour and 31 minutes
Detected by Pingoru
Jan 14, 2026, 02:15 PM UTC

Update timeline

  1. resolved Jan 14, 2026, 02:15 PM UTC

    Type: Incident
    Duration: 1 hour and 31 minutes
    Affected Components: kubernetes.hpc.ut.ee, my.lumi-supercomputer.eu, puhuri.metacenter.no, minu.etais.ee, puhuri-portal.neic.no, account.lumi.cscs.ch, docs.hpc.ut.ee, lumi.deic.dk, hpc.ut.ee, registry.hpc.ut.ee, Waldur portals → UT HPC webservices

    Jan 14, 14:15:00 GMT+0 - Investigating - We are currently seeing elevated request error rates and request latency for applications in Kubernetes. We are working to identify the cause.

    Jan 14, 15:21:43 GMT+0 - Monitoring - We implemented a fix and are currently monitoring the result.

    Jan 14, 15:46:27 GMT+0 - Resolved - The issue was related to the number of connections to and from Kubernetes. Our firewalls were configured not to allow such a high connection rate, and as we add new nodes to the cluster, the base rate has come to exceed the limits. The limits have been revisited, and new monitoring is being added to prevent the same issue in the future.
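
    The root cause described in the resolution note (a base connection rate that grows with node count until it crosses a fixed firewall limit) can be illustrated with a small headroom check. This is only a sketch under stated assumptions, not UTHPC's actual monitoring: the function name, the per-node rate, the firewall limit, and the warning thresholds are all hypothetical, since the incident report does not publish any of these figures.

```python
from dataclasses import dataclass


@dataclass
class RateCheck:
    projected_rate: float  # connections per second expected for the cluster
    utilization: float     # projected_rate divided by the firewall limit
    status: str            # "ok", "warn", or "critical"


def check_connection_headroom(
    node_count: int,
    per_node_rate: float,      # hypothetical average connections/s per node
    firewall_limit: float,     # hypothetical firewall connection-rate limit
    warn_at: float = 0.7,      # assumed warning threshold (70% of limit)
    critical_at: float = 0.9,  # assumed critical threshold (90% of limit)
) -> RateCheck:
    """Estimate the base connection rate and compare it to the firewall limit."""
    if firewall_limit <= 0:
        raise ValueError("firewall_limit must be positive")

    # Simplifying assumption: the base rate scales linearly with node count.
    projected = node_count * per_node_rate
    utilization = projected / firewall_limit

    if utilization >= critical_at:
        status = "critical"
    elif utilization >= warn_at:
        status = "warn"
    else:
        status = "ok"
    return RateCheck(projected, utilization, status)


if __name__ == "__main__":
    # Illustrative numbers only: adding nodes pushes the base rate to the limit.
    for nodes in (10, 16, 20):
        result = check_connection_headroom(nodes, 100.0, 2000.0)
        print(nodes, result.status, f"{result.utilization:.0%}")
```

    Running a check like this whenever nodes are added would flag shrinking headroom before the base rate alone reaches the limit, which is the situation the resolution note describes.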