Netdata incident

Nightly static builds overwrite node and metric data upon install

Netdata experienced a major incident on April 1, 2025 affecting Agent (all platforms), lasting 1d. The incident has been resolved; the full update timeline is below.

Started: Apr 01, 2025, 09:00 AM UTC
Resolved: Apr 02, 2025, 09:25 AM UTC
Duration: 1d
Detected by Pingoru: Apr 01, 2025, 09:00 AM UTC

Affected components

Agent (all platforms)

Update timeline

  1. investigating Apr 01, 2025, 09:00 AM UTC

    We have found that a recent change in the nightly static builds of Netdata Agent causes metadata on the Agent to be overwritten. Specifically, the sqlite3 database that records which timeseries stored in dbengine correspond to which metrics, and the information on the Agent's "machine GUID", are overwritten with the copy included in the build package.

    Not affected are:
    - All stable releases
    - Native packages (.deb and .rpm)

    Affected are all nightly static builds with the following version numbers:
    - 2.3.0-50-nightly
    - 2.3.0-60-nightly
    - 2.3.0-72-nightly
    - 2.3.0-78-nightly
    - 2.3.0-87-nightly

    The initial impact is that all affected Agent installs, even though they still have the timeseries data stored on disk, have lost all metadata associated with it, so these timeseries have become inaccessible. This is unrecoverable. Additionally, the Agent's main form of identification is overwritten, too.

    We are assessing the impact on users of Netdata Cloud, and will update this incident with more information when the investigation is completed.

    The bug itself has been fixed and the fix has been merged. We will issue a new nightly build shortly.

  2. resolved Apr 02, 2025, 09:25 AM UTC

    Affected Agents can cause the creation of multiple duplicate nodes in Netdata Cloud. All but the last one will appear as offline, and the last one will appear as if it had been created from scratch, with no data. Unfortunately, the previously stored metrics for the affected nodes cannot be recovered.

    The duplicate offline nodes can be safely deleted from Space Settings -> Nodes. Note that you may have to add the newest copy of these nodes to the appropriate rooms.

    The fixed nightly static build is v2.3.0-102.
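
To check a fleet against the affected version numbers listed in this incident, a minimal sketch like the following may help. It assumes you already have each Agent's version string in the same form used in the notice (for example "2.3.0-50-nightly", optionally prefixed with "v"); the function name `is_affected_nightly` is hypothetical and not part of Netdata. The version string alone does not reveal the install type, so a match matters only for static builds; native packages (.deb and .rpm) and stable releases are not affected.

```python
# Affected nightly static builds, copied from the incident notice.
AFFECTED_NIGHTLIES = frozenset({
    "2.3.0-50-nightly",
    "2.3.0-60-nightly",
    "2.3.0-72-nightly",
    "2.3.0-78-nightly",
    "2.3.0-87-nightly",
})


def is_affected_nightly(version: str) -> bool:
    """Return True if the version string matches a listed nightly build.

    Assumption: the string uses the same format as the notice, with an
    optional leading "v". Any other format is simply treated as no match.
    """
    normalized = version.strip()
    if normalized.startswith("v"):
        normalized = normalized[1:]
    return normalized in AFFECTED_NIGHTLIES


if __name__ == "__main__":
    # Hypothetical inventory of version strings gathered from your nodes.
    for reported in ["v2.3.0-72-nightly", "2.3.0", "2.3.0-102-nightly"]:
        status = "listed as affected" if is_affected_nightly(reported) else "not listed"
        print(f"{reported}: {status}")
```

The check uses an exact match rather than a version range, because the notice names specific builds and does not state whether nightlies between them were published.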