Harness incident

Test Intelligence service is impacted in Prod1 / Prod3

Minor Resolved
Started
Mar 17, 2026, 06:04 PM UTC
Resolved
Mar 17, 2026, 06:39 PM UTC
Duration
34m
Detected by Pingoru
Mar 17, 2026, 06:04 PM UTC

Affected components

  • Continuous Integration Enterprise(CIE) - Mac Cloud Builds
  • Continuous Integration Enterprise(CIE) - Windows Cloud Builds
  • Continuous Integration Enterprise(CIE) - Linux Cloud Builds

Update timeline

  1. investigating Mar 17, 2026, 06:04 PM UTC

    We are currently investigating this issue.

  2. identified Mar 17, 2026, 06:07 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Mar 17, 2026, 06:10 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Mar 17, 2026, 06:39 PM UTC

    This incident has been resolved.

  5. postmortem Mar 24, 2026, 07:17 PM UTC

    ## **Summary**

    On March 17, between approximately **3:04 PM and 4:05 PM PST**, customers experienced **degraded performance and intermittent unavailability** for Test Intelligence (TI) related APIs. This impacted test result uploads and report access.

    ## **Root Cause**

    A recent change introduced additional background processing during test result ingestion, which significantly increased database load. This led to resource saturation and caused elevated latency and temporary service disruption.

    ## **Impact**

    * **Degraded performance** for test uploads and report retrieval
    * **Intermittent API unavailability (~7 minutes)**
    * Affected customers experienced slower response times across TI-related workflows

    ## **Remediation**

    ### **Immediate**

    * Increased database capacity to stabilize performance
    * Applied configuration changes to reduce system load

    ### **Permanent**

    * Optimized processing logic to reduce database overhead
    * Introduced safeguards to prevent similar load amplification scenarios

    ## **Action Items**

    * Improve performance validation using production-scale datasets
    * Enhance safeguards for new feature rollouts
    * Strengthen monitoring to detect abnormal load patterns earlier
    * Add controls to limit impact scope of similar changes
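
The postmortem does not say which safeguards were introduced. As a purely illustrative sketch, and not Harness's actual implementation, one common way to keep background processing from amplifying database load is to cap how many background jobs run at once. The names below (`run_bounded`, `fake_post_processing`, `MAX_CONCURRENT_BACKGROUND_JOBS`) and the cap value are hypothetical.

```python
import asyncio

# Hypothetical cap; a real value would come from load testing.
MAX_CONCURRENT_BACKGROUND_JOBS = 4


async def run_bounded(jobs, limit=MAX_CONCURRENT_BACKGROUND_JOBS):
    """Run zero-argument coroutine factories with at most `limit` in flight.

    Returns (results, peak), where peak is the highest number of jobs
    observed running at the same time.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once `limit` jobs are running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_post_processing(report_id):
    """Stand-in for a database-heavy step performed after ingestion."""
    await asyncio.sleep(0.01)
    return report_id


def demo():
    # Twenty uploads arrive together; only four are processed at a time.
    jobs = [lambda i=i: fake_post_processing(i) for i in range(20)]
    return asyncio.run(run_bounded(jobs, limit=4))
```

With a cap like this, a burst of uploads queues behind the semaphore instead of opening one database-heavy task per upload, so a change that adds work per upload increases queueing time rather than peak database load.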
