Treasure Data incident

[US region] Presto Query Engine - Degraded Performance

Minor Resolved

Treasure Data experienced a minor incident on September 4, 2024 affecting Presto Query Engine, lasting 14h 56m. The incident has been resolved; the full update timeline is below.

Started
Sep 04, 2024, 10:22 PM UTC
Resolved
Sep 05, 2024, 01:18 PM UTC
Duration
14h 56m
Detected by Pingoru
Sep 04, 2024, 10:22 PM UTC

Affected components

Presto Query Engine

Update timeline

  1. investigating Sep 04, 2024, 10:22 PM UTC

    We are investigating a possible problem currently affecting Presto. Queries could be delayed. We will provide an update as soon as we know more.

  2. monitoring Sep 04, 2024, 11:19 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. investigating Sep 05, 2024, 01:48 AM UTC

    This incident is still ongoing. We are investigating the root cause.

  4. investigating Sep 05, 2024, 03:53 AM UTC

    Performance has improved for some, but not all, queries. We are continuing to investigate the issue.

  5. monitoring Sep 05, 2024, 04:44 AM UTC

    We have applied the fix and will continue to monitor the results.

  6. monitoring Sep 05, 2024, 08:28 AM UTC

    Systems should be back to normal, but we will continue to monitor the situation for a while.

  7. resolved Sep 05, 2024, 01:18 PM UTC

    The incident is now resolved. All affected components are back to normal. A subset of customers in the US region might have experienced degraded performance on Presto queries between 4:50 PM EDT and 1:40 AM EDT (8:50 PM to 5:40 AM UTC). Presto queries might also have been queued for longer than usual during the incident. Finally, some queries might have failed due to the remediations that were put in place.