Honeycomb.io incident

Delays in SLO, Service Maps processing

Honeycomb.io experienced a major incident on October 20, 2025 affecting ui.honeycomb.io - US1 Querying and ui.honeycomb.io - US1 Trigger & SLO Alerting, lasting 1d 1h. The incident has been resolved; the full update timeline is below.

Started: Oct 20, 2025, 02:16 PM UTC
Resolved: Oct 21, 2025, 03:41 PM UTC
Duration: 1d 1h
Detected by Pingoru: Oct 20, 2025, 02:16 PM UTC

Affected components

ui.honeycomb.io - US1 Querying
ui.honeycomb.io - US1 Trigger & SLO Alerting

Update timeline

monitoring Oct 20, 2025, 02:16 PM UTC

Due to constrained EC2 instance capacity in the AWS us-east-1 region, we are choosing to allocate the capacity we have to incoming events and telemetry. As such, customers may see delays of over 5 minutes in our processing of:
- SLOs
- Service Maps
We do not expect a degradation of our core ingest -> query flow, and we do not expect triggers to be impacted.
monitoring Oct 20, 2025, 03:14 PM UTC

SLO processing has recovered. Service Maps continues to be degraded but historical data can be queried.
monitoring Oct 20, 2025, 03:30 PM UTC

The networking issues in us-east-1 are affecting our query engine. Customers may see errors or delays when running queries as a result.
monitoring Oct 20, 2025, 05:39 PM UTC

Querying is showing signs of recovery. We are continuing to monitor the situation.
monitoring Oct 20, 2025, 08:06 PM UTC

AWS is starting to show signs of recovery. Querying remains partially impacted and may take longer to return results. We are continuing to monitor the situation.
monitoring Oct 20, 2025, 11:08 PM UTC

Service maps are up and running and errors have resolved. Querying, triggers, and SLOs are continuing to improve, and customers may still see delays in the meantime. We are continuing to monitor the situation.
monitoring Oct 21, 2025, 12:11 AM UTC

Honeycomb core functionality is operational. Service maps are up and running and errors have resolved. Querying, triggers, and SLOs are continuing to improve, and customers may still see delays in the meantime. This is our final update for the night unless the situation degrades. We will continue to monitor the situation.
resolved Oct 21, 2025, 03:41 PM UTC

This incident has been resolved. At no point did we lose any customer data that hit our load balancers. Querying has been stable for hours, and all features that were degraded are functional.