Census incident

Sync execution halted

Critical Resolved

Census experienced a critical incident on August 23, 2024, that affected Sync Engine and lasted 1h 33m. The incident has been resolved; the full update timeline is below.

Started
Aug 23, 2024, 05:41 PM UTC
Resolved
Aug 23, 2024, 07:15 PM UTC
Duration
1h 33m
Detected by Pingoru
Aug 23, 2024, 05:41 PM UTC

Affected components

Sync Engine

Update timeline

  1. identified Aug 23, 2024, 05:41 PM UTC

    Production sync execution has been paused while we investigate an issue.

  2. identified Aug 23, 2024, 05:58 PM UTC

    We have identified an issue that could cause records to be synced multiple times to destinations. Since this can affect the correctness of syncs for certain destinations, we have proactively paused all syncs until the issue can be diagnosed and resolved. We are continuing to investigate and are in the process of rolling back to a previous version of the sync engine.

  3. identified Aug 23, 2024, 06:42 PM UTC

    We have determined that the bug does not affect all syncs, nor does it cause records to be sent multiple times; the duplicate-send concern was a red herring. A high percentage of syncs are still failing, but there is no correctness issue, so we are re-enabling the sync engine globally so that unaffected syncs can return to operation.

  4. identified Aug 23, 2024, 06:48 PM UTC

    We have identified the likely root cause and are deploying a fix at this time.

  5. monitoring Aug 23, 2024, 07:04 PM UTC

    Sync success rates have returned to expected levels. Sync execution is somewhat delayed while we work through the backlog that built up during the outage. We are continuing to monitor.

  6. resolved Aug 23, 2024, 07:15 PM UTC

    This incident has been resolved.