Harness experienced a notice-level incident on May 8, 2026, lasting —. The incident has been resolved; the full update timeline is below.
Update timeline
- resolved May 08, 2026, 05:08 PM UTC
On April 22, 2026, a couple of customers experienced feature flags returning default treatments due to invalid empty segment membership responses being cached and served to SDKs. The issue was isolated to those customers under specific high-concurrency conditions involving Large Segments.
- postmortem May 08, 2026, 05:09 PM UTC
## Summary

On April 22, 2026, a couple of customer environments experienced feature flags returning default treatments due to invalid empty segment membership responses being cached and served to SDKs. The issue was isolated to those environments under specific high-concurrency conditions involving Large Segments.

## Impact

* Feature flag evaluations returned default (fallback) values for affected keys
* Limited to a single customer environment with ~5.5M keys; approximately 0.9% of keys were affected
* No other customers or environments were impacted
* The condition was triggered by a traffic spike associated with Large Segment updates

## Root Cause

Under high-concurrency conditions triggered by Large Segment updates, an internal rate-limiting mechanism produced responses that the data retrieval layer incorrectly interpreted as valid empty results. These empty responses were then cached and served to SDKs, causing affected keys to evaluate to default treatments instead of their configured values. The condition could not occur under normal traffic patterns; it required a specific combination of Large Segment updates generating concurrent requests that exceeded internal thresholds.

## Remediation

**Fix deployed:** Updated the data retrieval path to correctly identify and reject invalid responses from the rate-limiting mechanism, preventing empty data from being cached regardless of traffic conditions. The fix was validated under load testing that reproduced the original failure.

## Action Items

* **Recurring load testing in production:** Implement periodic load tests in a controlled production environment that exhaust service capacity to replicate this class of failure. All keys in the controlled environment will be continuously validated, enabling early detection of data integrity degradation before it reaches customer environments.