Qubole experienced a critical incident on April 22, 2021, affecting QDS API and Command Processing and 1 more component; it lasted 5d 7h. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Apr 22, 2021, 12:52 PM UTC
Spark and Presto queries run on in.qubole.com may stall, showing a Pending or Queued status. Devops is investigating.
- investigating Apr 22, 2021, 02:56 PM UTC
We are continuing to investigate this issue.
- monitoring Apr 23, 2021, 06:07 PM UTC
Devops is monitoring its latest fix; the issue should now be resolved. Additional information about the resolution will be added after monitoring.
- monitoring Apr 25, 2021, 11:44 AM UTC
We are continuing to monitor for any further issues.
- monitoring Apr 25, 2021, 11:47 AM UTC
An additional incident of stalled operations was reported yesterday evening (4/24); it has since cleared. Devops is looking into the root cause of the stall so that a more permanent fix can be applied.
- monitoring Apr 26, 2021, 05:51 AM UTC
We are continuing to monitor for any further issues.
- resolved Apr 27, 2021, 08:45 PM UTC
Devops expects the operational issues to be resolved. After restarting discovery, they needed to augment the client nodes to handle the volume of traffic.