Fluid Attacks incident

Degraded service availability

Minor · Resolved

Fluid Attacks experienced a minor incident on December 31, 2025 affecting Platform and Agent, lasting 2d 8h. The incident has been resolved; the full update timeline is below.

Started: Dec 31, 2025, 05:02 AM UTC
Resolved: Jan 02, 2026, 01:53 PM UTC
Duration: 2d 8h
Detected by Pingoru: Dec 31, 2025, 05:02 AM UTC

Affected components

Platform, Agent

Update timeline

  1. identified Jan 15, 2026, 01:45 PM UTC

    It was identified that the CI Gate and services that depend on the API are currently failing.

  2. resolved Jan 15, 2026, 01:47 PM UTC

    The incident has been resolved, and the affected services are now operating normally.

  3. postmortem Jan 15, 2026, 02:13 PM UTC

    **Impact** At least one user experienced service interruptions affecting the agent and other features that depend on the API. The issue started on 2025-12-29 at 22:25 (UTC-5) and was proactively discovered 3.7 days (TTD) later by one of our monitoring tools, which reported failures in several system components. While investigating these alerts, the team identified that background processes required for the service to operate were not running correctly. The problem was resolved in 1.6 hours (TTF), resulting in a total window of exposure of 3.7 days (WOE) [\[1\]](https://gitlab.com/fluidattacks/universe/-/issues/19803).

    **Cause** A required access key was missing from the production configuration. The key had been removed by mistake, which caused background processes to fail, and as a result, any functionality that depended on them stopped working [\[2\]](https://gitlab.com/fluidattacks/universe/-/merge_requests/91520/diffs).

    **Solution** The missing access key was restored in the production configuration file, allowing the system to authenticate properly and resume normal operations [\[3\]](https://gitlab.com/fluidattacks/universe/-/merge_requests/91677).

    **Conclusion** Once the key was restored, all affected components recovered, and the service returned to normal. This incident emphasizes the importance of safeguarding critical configuration values to prevent widespread service disruptions.

    **INFRASTRUCTURE\_ERROR**
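
The cause described above, a required key silently missing from the production configuration, is the kind of failure a startup or pre-deploy check can surface early. The following is a minimal sketch of such a check, not the fix Fluid Attacks applied. The key names, the helper functions, and the assumption that configuration is available as a simple key/value mapping are all hypothetical, since the postmortem does not name the key or describe how configuration is loaded.

```python
# Hypothetical key names: the postmortem does not say which key was removed.
REQUIRED_KEYS = ("API_ACCESS_KEY", "WORKER_QUEUE_URL")


def missing_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in config, in order."""
    return [key for key in required if not config.get(key)]


def validate_config(config, required=REQUIRED_KEYS):
    """Raise before startup or deploy if any required key is missing."""
    missing = missing_keys(config, required)
    if missing:
        raise RuntimeError(
            "missing required configuration keys: " + ", ".join(missing)
        )


# Example: an empty value is treated the same as a missing key.
example = {
    "API_ACCESS_KEY": "",
    "WORKER_QUEUE_URL": "https://queue.example.invalid",
}
print(missing_keys(example))  # prints ['API_ACCESS_KEY']
```

Calling `validate_config` in a CI job or at process startup would turn a removed key into an immediate, explicit failure instead of background processes failing unnoticed.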