Harness incident
FME SDK is experiencing elevated error rates for Impressions and events
Harness experienced a minor incident on March 17, 2026, affecting FME and 1 more component, lasting 2h 28m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Mar 17, 2026, 04:36 PM UTC
The issue started around 6:45 AM PT, and the team is currently investigating.
- monitoring Mar 17, 2026, 05:05 PM UTC
We are now monitoring the results.
- resolved Mar 17, 2026, 11:38 PM UTC
This incident has been resolved.
- postmortem Mar 18, 2026, 10:57 PM UTC
## **Summary**

_On March 17, 2026, FME event and impression ingestion experienced significant degradation, resulting in elevated latency and error rates. The impact was traced to degraded performance in the underlying shared infrastructure used for event processing._

## **Root Cause**

_We had an unexpected surge in traffic, which put stress on our systems._

## **Impact**

SDKs sending impressions and events would experience elevated error logging and continue to retry. Retry behavior differs by SDK: each SDK’s retry policy is designed and tailored to its runtime environment to avoid any application impact. In some scenarios, events and impressions may be lost if they are not successfully delivered according to the SDK’s specific retry policy. There was no impact on our control plane services, and feature flag delivery and evaluations continued to work without any disruption.

## **Mitigation**

_To mitigate, we immediately increased capacity to handle the bursty traffic._

## **Action Items**

To prevent such issues from happening again, we are working on the following items:

1. Evaluate and enforce per-customer rate limits.
2. _Improve auto-scaling and on-demand scale-up of the network infrastructure._
3. _Improve resiliency of the ingestion layer._
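As an illustration of the per-customer rate-limit action item, the sketch below shows one common approach, a token bucket kept per customer. It is not a description of how Harness implements or plans to implement rate limiting; the class names, the `customer_id` key, and the `rate` and `capacity` values are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TokenBucket:
    """Permits bursts up to `capacity`, refilled at `rate` tokens per second."""
    rate: float
    capacity: float
    tokens: float
    updated_at: float

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class PerCustomerRateLimiter:
    """Hypothetical limiter: one independent bucket per customer ID."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the behavior can be tested
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, customer_id: str, cost: float = 1.0) -> bool:
        now = self.clock()
        bucket = self.buckets.get(customer_id)
        if bucket is None:
            # A new customer starts with a full bucket.
            bucket = TokenBucket(self.rate, self.capacity, self.capacity, now)
            self.buckets[customer_id] = bucket
        return bucket.allow(now, cost)


if __name__ == "__main__":
    limiter = PerCustomerRateLimiter(rate=100.0, capacity=200.0)
    accepted = sum(limiter.allow("customer-a") for _ in range(500))
    print(f"accepted {accepted} of 500 burst requests from customer-a")
    print("customer-b still admitted:", limiter.allow("customer-b"))
```

Because each customer draws from its own bucket, a burst from one customer exhausts only that customer's tokens, while requests from other customers are still admitted.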