Harness incident

Hosted CI build VM environment is seeing higher network latency

Minor Resolved
Started
Feb 19, 2026, 05:36 PM UTC
Resolved
Feb 19, 2026, 06:20 PM UTC
Duration
44m
Detected by Pingoru
Feb 19, 2026, 05:36 PM UTC

Affected components

  • Continuous Integration Enterprise(CIE) - Mac Cloud Builds
  • Continuous Integration Enterprise(CIE) - Windows Cloud Builds
  • Continuous Integration Enterprise(CIE) - Linux Cloud Builds

Update timeline

  1. investigating Feb 19, 2026, 05:36 PM UTC

    We are currently investigating this issue.

  2. identified Feb 19, 2026, 05:39 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Feb 19, 2026, 06:11 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Feb 19, 2026, 06:20 PM UTC

    This incident has been resolved.

  5. postmortem Mar 02, 2026, 05:32 PM UTC

    ## **Summary**

    On February 19, 2026, a partial degradation occurred in the CI infrastructure in the **us-west1** region due to issues affecting the NAT control plane. During a brief window (~30 minutes), a limited number of CI build jobs failed during VM initialization. The issue was detected through internal monitoring and mitigated via controlled failover, followed by restoration of the affected NAT instances.

    ## **Root Cause**

    The incident was caused by saturation of connection tracking (iptables/conntrack state) on NAT virtual machines in the us-west1 region. A short-lived spike in build VM activity led to a burst of metadata-related connections. Over time, stale connection entries accumulated without automated cleanup, eventually preventing the NAT VMs from accessing the cloud metadata service.

    This metadata connectivity disruption impacted control-plane functionality (including VM provisioning), which resulted in a limited number of build initialization failures.

    ## **Impact**

    * **Region Impacted:** us-west1
    * **Customer Impact:**
        * Limited CI job failures during VM provisioning
        * Two customers experienced isolated build failures
        * No impact to running workloads
    * **Data Loss:** None
    * **Duration:** Approximately 30 minutes

    Customers were advised to retry failed builds after mitigation.

    ## **Mitigation**

    * Traffic was automatically failed over to another NAT to maintain egress functionality.
    * Affected NAT VMs were restarted to clear saturated connection state.
    * Metadata connectivity, SSH access, monitoring, and health checks were verified.
    * Traffic was gradually restored to the affected NAT after stability was confirmed.
    * Cloud NAT IP utilization was monitored during failover to prevent capacity exhaustion.

    ### **Action Items and Permanent Preventive Measures**

    To prevent such issues from happening again, we will:

    * Implement automated cleanup of metadata-related connection tracking state.
    * Add proactive health checks and alerts for metadata reachability.
    * Strengthen monitoring for NAT VM control-plane health degradation.
    * Enhance fallback guardrails and capacity validation for Cloud NAT.
    * Deploy automated iptables/conntrack cleanup for metadata traffic.
    * Improve monitoring for metadata connectivity health.
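The root cause above is a conntrack table filling up, and the action items call for stronger monitoring of NAT VM health. As an illustration only, the following is a minimal sketch of a saturation check, assuming Linux NAT hosts with the `nf_conntrack` module loaded (which exposes the two procfs files used below). The function names and the 70%/90% thresholds are hypothetical and are not taken from the vendor's postmortem or tooling.

```python
from pathlib import Path

# Standard Linux procfs entries, present when the nf_conntrack module is loaded.
COUNT_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_count")
MAX_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_max")


def conntrack_utilization(count: int, maximum: int) -> float:
    """Return the fraction of the conntrack table currently in use."""
    if maximum <= 0:
        raise ValueError("nf_conntrack_max must be positive")
    return count / maximum


def classify(utilization: float, warn: float = 0.70, crit: float = 0.90) -> str:
    """Map a utilization fraction to an alert level.

    The default thresholds are illustrative assumptions, not vendor values.
    """
    if utilization >= crit:
        return "critical"
    if utilization >= warn:
        return "warning"
    return "ok"


def read_int(path: Path) -> int:
    """Read a single integer from a procfs-style file."""
    return int(Path(path).read_text().strip())


def check_host(count_path: Path = COUNT_PATH, max_path: Path = MAX_PATH) -> str:
    """Classify the local host's conntrack table usage."""
    used = read_int(count_path)
    limit = read_int(max_path)
    return classify(conntrack_utilization(used, limit))
```

Calling `check_host()` on a NAT VM from a periodic job would return `"ok"`, `"warning"`, or `"critical"`, which could then feed an alerting system. A check like this reports table pressure only; it does not replace a direct probe of metadata reachability, which the action items list separately.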
