Boost Commerce incident

Sync system affected by AWS operational issue

Critical, Resolved

Boost Commerce experienced a critical incident on September 19, 2023 affecting Data Sync, lasting 7h 7m. The incident has been resolved; the full update timeline is below.

Started
Sep 19, 2023, 12:46 AM UTC
Resolved
Sep 19, 2023, 07:53 AM UTC
Duration
7h 7m
Detected by Pingoru
Sep 19, 2023, 12:46 AM UTC

Affected components

Data Sync

Update timeline

  1. identified Sep 19, 2023, 12:46 AM UTC

    The sync system is being affected by an ongoing AWS cloud issue. Other components are working normally. The cloud incident status is being reported at https://health.aws.amazon.com/health/status and https://downdetector.com/status/aws-amazon-web-services/

  2. identified Sep 19, 2023, 12:47 AM UTC

    We are continuing to work on a fix for this issue.

  3. identified Sep 19, 2023, 02:03 AM UTC

    Update from AWS (https://health.aws.amazon.com/health/status): "Sep 18 6:56 PM PDT We continue to work toward resolving the increased networking latencies and errors affecting Availability Zones (usw2-az1 and usw2-az2) in the US-WEST-2 Region. While network mapping propagation times have remained stable, we have not yet seen the improvement in propagation latencies that we had hoped for. We continue to work on multiple paths to mitigation. We have completed the first phase to address the resource contention within this subsystem, and are actively working on the second and third phases in parallel. Our current plan is to implement these mitigations within a single AZ (usw2-az1), and verify recovery occurs before proceeding to perform these same mitigations in usw2-az2. We will continue to provide updates every 60 minutes, or as we have additional information to share." While waiting for usw2-az1 and usw2-az2 to recover, we are working on deploying our services to another AZ: usw2-az3.

  4. monitoring Sep 19, 2023, 03:23 AM UTC

    We're seeing the sync starting to work normally again. Store data is being resynced. We'll keep monitoring. Latest update from AWS: "[08:01 PM PDT] We continue to work towards resolving the increased networking latencies and errors affecting Availability Zones (usw2-az1 and usw2-az2) in the US-WEST-2 Region. We continue to see improvement to network mapping latencies but they are not at normal levels yet. Other AWS services are also starting to see recovery as network mapping latencies improve. We will continue to keep you updated until network mapping latencies have returned to normal levels."

  5. resolved Sep 19, 2023, 07:53 AM UTC

    This incident has been resolved.
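
The quoted AWS updates are stamped in PDT, while the timeline above is in UTC. The following is a minimal sketch of how the two can be lined up and how the 7h 7m duration follows from the start and resolution times. It assumes PDT is a fixed UTC-7 offset and that the undated "[08:01 PM PDT]" update falls on Sep 18 like the earlier one; the helper name `pdt_to_utc` is hypothetical and used only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PDT is treated as a fixed UTC-7 offset.
PDT = timezone(timedelta(hours=-7), "PDT")
UTC = timezone.utc


def pdt_to_utc(year, month, day, hour, minute):
    """Convert a PDT wall-clock time to UTC (hypothetical helper)."""
    return datetime(year, month, day, hour, minute, tzinfo=PDT).astimezone(UTC)


# AWS update quoted in timeline entry 3: "Sep 18 6:56 PM PDT"
aws_update_1 = pdt_to_utc(2023, 9, 18, 18, 56)

# AWS update quoted in timeline entry 4: "[08:01 PM PDT]"
# Assumption: same calendar day (Sep 18) as the previous update.
aws_update_2 = pdt_to_utc(2023, 9, 18, 20, 1)

print(aws_update_1.isoformat())  # 2023-09-19T01:56:00+00:00
print(aws_update_2.isoformat())  # 2023-09-19T03:01:00+00:00

# Incident duration from the "Started" and "Resolved" fields.
started = datetime(2023, 9, 19, 0, 46, tzinfo=UTC)
resolved = datetime(2023, 9, 19, 7, 53, tzinfo=UTC)
duration = resolved - started
print(duration)  # 7:07:00
```

Under these assumptions, the two AWS updates correspond to 01:56 and 03:01 UTC on Sep 19, shortly before the timeline entries that quote them (02:03 and 03:23 UTC).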