Alkira incident

API calls timing out

Notice Resolved View vendor source →

Alkira experienced a notice incident on October 11, 2023, lasting —. The incident has been resolved; the full update timeline is below.

Started
Oct 11, 2023, 11:04 PM UTC
Resolved
Oct 10, 2023, 12:00 PM UTC
Duration
Detected by Pingoru
Oct 11, 2023, 11:04 PM UTC

Update timeline

  1. resolved Oct 11, 2023, 11:04 PM UTC

    Summary of the Problem From 5AM to 9AM UTC on October 10, 2023, we detected a surge in database queries. Initially, the system managed the load, but by 9AM, the increased load caused API and Network provisioning calls to time out, triggering our internal alerts. Our investigation traced the surge to the Cloud Insights service. Deactivating this service at 11:15 AM UTC restored the affected services. We identified an inefficient query within Cloud Insights as the cause. Findings The inefficient query within the Cloud Insights service placed undue load on the database, affecting API operations. Corrective Actions We took the Cloud Insights service offline at 11:15 AM UTC, which stabilized the API and Network provisioning service. We rectified the problematic query within Cloud Insights. Next Steps To avoid similar incidents, we have moved the Cloud Insights service to its own database, ensuring it won't interfere with the API or Network Provisioning service. We have implemented and deployed all improvements. We sincerely regret any inconvenience. For further queries, please contact Alkira Support.