Qubole incident
Accessibility and performance issue on api.qubole.com
Qubole experienced a minor incident on June 11, 2021 affecting Site Availability, QDS API, and 1 more component, lasting 1 day. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Jun 11, 2021, 07:16 PM UTC
api.qubole.com is currently seeing degraded performance and occasionally returning 404 errors. At this time the failures appear to be partial and intermittent, and Devops is investigating.
- investigating Jun 12, 2021, 04:05 AM UTC
api.qubole.com is currently running slowly due to extremely high throughput, likely complicated by an initial issue with burst throttling (now resolved). Although the number of connections and backlogged operations is steadily coming down, the webapp is still slow to take in and process requests. Newly added nodes are unstable in the webapp tier: they are failing and not accepting traffic.
- resolved Jun 12, 2021, 08:00 PM UTC
A large ad hoc workload that ran into unexpected errors drove an ongoing backlog of operations. The lagging operations have either finished or been killed, restoring regular performance. Devops is doing post-mortem monitoring of the environment.