Treasure Data incident

[US Region] Query Engine - Service Degraded Performance

Minor, Resolved

Treasure Data experienced a minor incident on October 2, 2024 affecting Streaming Import REST API, Mobile/Javascript REST API, and four other components, lasting 1h 25m. The incident has been resolved; the full update timeline is below.

Started
Oct 02, 2024, 12:04 AM UTC
Resolved
Oct 02, 2024, 01:30 AM UTC
Duration
1h 25m
Detected by Pingoru
Oct 02, 2024, 12:04 AM UTC

Affected components

Streaming Import REST API, Mobile/Javascript REST API, Data Connector Integrations, Hadoop / Hive Query Engine, Presto Query Engine, Presto JDBC/ODBC Gateway

Update timeline

  1. investigating Oct 02, 2024, 12:04 AM UTC

    We're experiencing an elevated level of API errors and are currently looking into the issue.

  2. investigating Oct 02, 2024, 12:06 AM UTC

    We are continuing to investigate this issue.

  3. investigating Oct 02, 2024, 12:07 AM UTC

    We are continuing to investigate this issue.

  4. identified Oct 02, 2024, 12:41 AM UTC

    The issue has been identified and a fix is being implemented.

  5. monitoring Oct 02, 2024, 01:01 AM UTC

    A fix has been implemented and we are monitoring the results.

  6. monitoring Oct 02, 2024, 01:15 AM UTC

    We are continuing to monitor for any further issues.

  7. resolved Oct 02, 2024, 01:30 AM UTC

    This incident has been resolved.

  8. postmortem Oct 02, 2024, 04:34 AM UTC

    We experienced a temporary overload on the storage layer. It started at 16:15 PDT and was fixed at 18:15 PDT. The major impact was performance degradation for the data ingestion components (Streaming Import REST API, Mobile/Javascript REST API, Data Connector) and the Hive and Presto query engines. Some queries executed on Hive and Presto failed because of the degraded storage performance.