Omni Analytics incident

Slow Queries

Major Resolved View vendor source →

Omni Analytics experienced a major incident on April 8, 2024 affecting EastUsa, lasting 10m. The incident has been resolved; the full update timeline is below.

Started
Apr 08, 2024, 04:55 PM UTC
Resolved
Apr 08, 2024, 05:05 PM UTC
Duration
10m
Detected by Pingoru
Apr 08, 2024, 04:55 PM UTC

Affected components

EastUsa

Update timeline

  1. investigating Apr 08, 2024, 09:06 PM UTC

    Queries in the EastUsa region are slow and/or delayed.

  2. resolved Apr 08, 2024, 09:06 PM UTC

    For approximately 10 minutes starting around 9:55am US/Pacific time, queries in the EastUsa region ran slower than normal. A spike of queries against an information schema where the cache had expired triggered a thundering herd effect. That combined with some missing logic in our auto scaling caused queues to back up. We will be working to address the issue in the auto scaling logic in the next few days and will also work to change our caching logic to avoid any further thundering herd effects.