Honeycomb.io incident

Delays in SLO, Service Maps processing

Major, Resolved

Honeycomb.io experienced a major incident on October 20, 2025 affecting ui.honeycomb.io - US1 Querying and ui.honeycomb.io - US1 Trigger & SLO Alerting, lasting 1d 1h. The incident has been resolved; the full update timeline is below.

Started
Oct 20, 2025, 02:16 PM UTC
Resolved
Oct 21, 2025, 03:41 PM UTC
Duration
1d 1h
Detected by Pingoru
Oct 20, 2025, 02:16 PM UTC
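
The reported duration of 1d 1h can be checked from the start and resolution timestamps above. The following is a minimal sketch, not part of the vendor's report: the helper names `parse_utc` and `format_duration` are hypothetical, and the format string assumes the timestamp layout used on this page and an English locale for month abbreviations.

```python
from datetime import datetime, timezone

# Assumed layout of the timestamps on this page, e.g. "Oct 20, 2025, 02:16 PM UTC".
FMT = "%b %d, %Y, %I:%M %p"


def parse_utc(stamp: str) -> datetime:
    """Parse a page timestamp and attach the UTC timezone (hypothetical helper)."""
    naive = datetime.strptime(stamp.removesuffix(" UTC"), FMT)
    return naive.replace(tzinfo=timezone.utc)


def format_duration(start: datetime, end: datetime) -> str:
    """Return the elapsed time as days, hours, and minutes (hypothetical helper)."""
    total_minutes = int((end - start).total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days}d {hours}h {minutes}m"


started = parse_utc("Oct 20, 2025, 02:16 PM UTC")
resolved = parse_utc("Oct 21, 2025, 03:41 PM UTC")
print(format_duration(started, resolved))  # prints "1d 1h 25m"
```

The exact difference is 25 hours and 25 minutes, so the page's "1d 1h" appears to be that value with the minutes dropped.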

Affected components

ui.honeycomb.io - US1 Querying
ui.honeycomb.io - US1 Trigger & SLO Alerting

Update timeline

  1. monitoring Oct 20, 2025, 02:16 PM UTC

    Due to constrained EC2 instance capacity in the AWS us-east-1 region, we are choosing to allocate the capacity we have to incoming events and telemetry. As such, customers may see delays of over 5 minutes in our processing of:

    - SLOs
    - Service Maps

    We do not expect a degradation of our core ingest -> query flow, and we do not expect triggers to be impacted.

  2. monitoring Oct 20, 2025, 03:14 PM UTC

    SLO processing has recovered. Service Maps continues to be degraded but historical data can be queried.

  3. monitoring Oct 20, 2025, 03:30 PM UTC

    The networking issues in us-east-1 are affecting our query engine. Customers may see errors or delays when running queries as a result.

  4. monitoring Oct 20, 2025, 05:39 PM UTC

    Querying is showing signs of recovery. We are continuing to monitor the situation.

  5. monitoring Oct 20, 2025, 08:06 PM UTC

    AWS is starting to show signs of recovery. Querying remains partially impacted and may take longer to return results. We are continuing to monitor the situation.

  6. monitoring Oct 20, 2025, 11:08 PM UTC

    Service Maps are up and running, and errors have resolved. Querying, triggers, and SLOs continue to improve, though customers may still see delays while they recover. We are continuing to monitor the situation.

  7. monitoring Oct 21, 2025, 12:11 AM UTC

    Honeycomb core functionality is operational. Service Maps are up and running, and errors have resolved. Querying, triggers, and SLOs continue to improve, though customers may still see delays while they recover. This is our final update for the night unless the situation degrades. We will continue to monitor the situation.

  8. resolved Oct 21, 2025, 03:41 PM UTC

    This incident has been resolved. At no point did we lose any customer data that hit our load balancers. Querying has been stable for hours, and all features that were degraded are functional.