Auvik experienced a major incident on July 25, 2025 affecting us3.my.auvik.com, lasting 1h 49m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Jul 25, 2025, 07:24 PM UTC
Collectors have been disconnected in us3 since 14:35 ET (18:35 UTC). We are investigating the issue.
- monitoring Jul 25, 2025, 08:27 PM UTC
We have restarted the affected systems, and collectors are beginning to reconnect. We are monitoring the recovery closely.
- resolved Jul 25, 2025, 09:13 PM UTC
The incident has been resolved.
- postmortem Jul 30, 2025, 04:27 PM UTC
# Service Disruption - Collectors offline for clients on the US3 cluster

## Root Cause Analysis

### Duration of incident

Discovered: Jul 25, 2025 14:45 UTC

Resolved: Jul 25, 2025 17:54 UTC

### Cause

Backend nodes were removed from the US3 cluster during a routine cleanup effort intended to improve efficiency. The removal unintentionally included the backend hosting the root tenant, which disconnected collectors in the US3 cluster.

### Effect

Collectors in the US3 cluster lost connectivity with customer sites, resulting in disruptions to data collection and monitoring services. This caused temporary gaps in visibility across affected environments.

### Action taken

_All times are in UTC_

**07/25/2025**

**14:45** – Engineering notices that collector connections are beginning to fail.

**18:28** – The team observes that tenants are not loading.

**18:35** – Outage reports increase.

**18:40** – SEV declared, and the root cause investigation begins.

**18:48** – Backends re-added to balance load.

**19:00** – Alternate issues ruled out.

**19:23** – Root tenant backend identified as missing.

**19:25** – Cluster restart initiated.

**20:18** – Services begin recovery.

**21:07** – Incident resolved.

### Future consideration(s)

* Strengthen the backend removal process to confirm the root tenant is excluded.
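
One way a backend removal process could confirm that the root tenant is excluded is an automated pre-removal check. The following is a minimal Python sketch under stated assumptions, not a description of Auvik's actual tooling: `Backend`, `ProtectedBackendError`, `validate_removal`, and the tenant identifier `"root"` are hypothetical names introduced purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Backend:
    """Hypothetical record for one backend node (not Auvik's real data model)."""
    name: str
    tenants: frozenset[str]


class ProtectedBackendError(Exception):
    """Raised when a removal plan includes a backend hosting a protected tenant."""


def validate_removal(
    candidates: Iterable[Backend],
    protected_tenants: Iterable[str],
) -> list[Backend]:
    """Return the candidates unchanged if none hosts a protected tenant.

    Raises ProtectedBackendError listing every offending backend otherwise,
    so the whole plan is rejected rather than partially applied.
    """
    candidates = list(candidates)
    protected = frozenset(protected_tenants)

    # Map each offending backend name to the protected tenants it hosts.
    blocked = {
        backend.name: sorted(backend.tenants & protected)
        for backend in candidates
        if backend.tenants & protected
    }
    if blocked:
        raise ProtectedBackendError(
            f"refusing removal; protected tenants hosted on: {blocked}"
        )
    return candidates
```

In this sketch, a cleanup job would call `validate_removal(plan, {"root"})` before removing anything and abort if the exception is raised. Rejecting the whole plan, rather than silently skipping the protected backend, is a deliberate choice here so that an operator reviews the plan before it runs.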