Bigeye incident

Scheduled metric run degradation

Minor Resolved View vendor source →

Bigeye experienced a minor incident on February 15, 2024 affecting API and UI, lasting 5d 6h. The incident has been resolved; the full update timeline is below.

Started
Feb 15, 2024, 08:05 PM UTC
Resolved
Feb 21, 2024, 02:58 AM UTC
Duration
5d 6h
Detected by Pingoru
Feb 15, 2024, 08:05 PM UTC

Affected components

APIUI

Update timeline

  1. investigating Feb 15, 2024, 08:05 PM UTC

    We are investigating issues with invalid metric results for some metric configurations. We are temporarily pausing scheduled runs while we are investigating.

  2. investigating Feb 15, 2024, 08:36 PM UTC

    Scheduled metric runs have been re-enabled. We are still investigating issues with invalid metric results for data time window grouped metrics.

  3. identified Feb 15, 2024, 09:38 PM UTC

    The issue has been identified and a fix is being implemented.

  4. identified Feb 16, 2024, 02:55 AM UTC

    We have fixed the problem and are correcting metrics that are missing portions of metric run history with an ETA of tomorrow (Feb 16).

  5. identified Feb 17, 2024, 01:43 AM UTC

    The forward fix has been deployed and verified. All systems are operational. We are continuing to reprocess any historical metric runs which were affected by this issue

  6. identified Feb 20, 2024, 06:17 PM UTC

    We are continuing to reprocess the remaining affected historical metric runs

  7. resolved Feb 21, 2024, 02:58 AM UTC

    The incident has been resolved. All services are back online and healthy.