Dremio incident

Engine degradation in multiple AWS regions

Severity: Minor. Status: Resolved.

Dremio experienced a minor incident on September 18, 2023 affecting Site Availability, lasting 11h 7m. The incident has been resolved; the full update timeline is below.

Started
Sep 18, 2023, 08:33 PM UTC
Resolved
Sep 19, 2023, 07:41 AM UTC
Duration
11h 7m
Detected by Pingoru
Sep 18, 2023, 08:33 PM UTC

Affected components

Site Availability

Update timeline

  1. identified Sep 18, 2023, 08:33 PM UTC

    Customers may experience issues spinning up engines in two AWS regions (US-EAST-1 and US-WEST-2). Requests to start engines may see increased latency or time out. See the AWS status page (https://health.aws.amazon.com/health/status) for details on the underlying service outage. We will post updates as mitigations are announced.

  2. identified Sep 19, 2023, 12:14 AM UTC

    AWS reports that the issues in US-EAST-1 have been mitigated, and customers in US-EAST-1 should see normal system operation. US-WEST-2 is partially mitigated, and AWS is continuing to work toward completing the necessary updates. We will update when US-WEST-2 is fully operational again.

  3. identified Sep 19, 2023, 02:57 AM UTC

    The latest AWS update regarding US-WEST-2, posted Sep 18 at 7:27 PM PDT (Sep 19, 2:27 AM UTC, 30 minutes before this update), indicates early signs of improvement and estimates recovery over the next two hours. We are monitoring for additional hourly updates from AWS.

  4. monitoring Sep 19, 2023, 07:27 AM UTC

    AWS has resolved the incident. We are monitoring for any remaining impact or recurrence.

  5. resolved Sep 19, 2023, 07:41 AM UTC

    We've verified engines are behaving normally in the previously impacted AWS region. Service is operating normally.