Netdata incident

Delay in processing node availability changes

Minor Resolved

Netdata experienced a minor incident on December 6, 2022 affecting Agent - Cloud Connection (ACLK) and Agent (all platforms), lasting 3h 14m. The incident has been resolved; the full update timeline is below.

Started
Dec 06, 2022, 02:35 PM UTC
Resolved
Dec 06, 2022, 05:50 PM UTC
Duration
3h 14m
Detected by Pingoru
Dec 06, 2022, 02:35 PM UTC

Affected components

Agent - Cloud Connection (ACLK)
Agent (all platforms)

Update timeline

  1. identified Dec 06, 2022, 05:06 PM UTC

    We've identified an issue that delays the processing of node availability changes (online, stale, offline). For a fraction of our users, these changes are not reflected properly in Netdata Cloud. Because availability determines which metrics are shown in Cloud, some metrics may not be visible even though the node should be available.

  2. identified Dec 06, 2022, 05:07 PM UTC

    We are working through the backlog of availability updates and expect to finish in about 30 minutes.

  3. monitoring Dec 06, 2022, 05:34 PM UTC

    The backlog has been consumed. We are monitoring the situation.

  4. resolved Dec 06, 2022, 05:50 PM UTC

    This incident has been resolved.