Alkira incident

API service is returning errors

Minor, Resolved

Alkira experienced a minor incident on April 16, 2024 affecting API Service and Network Provisioning Service, lasting 29m. The incident has been resolved; the full update timeline is below.

Started
Apr 16, 2024, 03:41 PM UTC
Resolved
Apr 16, 2024, 04:10 PM UTC
Duration
29m
Detected by Pingoru
Apr 16, 2024, 03:41 PM UTC

Affected components

API Service, Network Provisioning Service

Update timeline

  1. investigating Apr 16, 2024, 03:41 PM UTC

    We are investigating an issue with our API service. If you log in to the Alkira portal, you might see errors. The problem was first reported at 15:00 UTC.

  2. identified Apr 16, 2024, 03:44 PM UTC

    We have identified an issue causing the API service to restart intermittently, resulting in errors on the Alkira portal. These restarts might cause some API calls to fail; retrying the call will typically succeed. We are working on a fix.

  3. monitoring Apr 16, 2024, 03:57 PM UTC

    A constraint check locked up the database, resulting in intermittent API service restarts. We have fixed the issue and stabilized the API service. We will continue to monitor the service for any errors.

  4. resolved Apr 16, 2024, 04:10 PM UTC

    At 14:52 UTC, our internal alerts fired, indicating that the API and Network Provisioning services had restarted. There are multiple replicas of this service, and at any given time one or two of them were restarting. Upon investigation, we identified a constraint check holding a lock on the database, resulting in errors in the service. By 15:40 UTC, we had identified and resolved the constraint check, after which the service stabilized. The issue appears to be fully resolved, and we will continue to monitor for any further errors. We apologize for any inconvenience this may have caused.
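
The "identified" update notes that a failed API call would typically succeed on retry while replicas were restarting. A client-side retry with exponential backoff is one common way to ride out that kind of intermittent failure. The sketch below is illustrative only and is not taken from Alkira's documentation or SDK: `call_with_retry`, `TransientAPIError`, and all parameter values are hypothetical names and defaults chosen for the example.

```python
import random
import time


class TransientAPIError(Exception):
    """Hypothetical marker for a failure worth retrying (for example an HTTP 5xx)."""


def call_with_retry(call, max_attempts=4, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Run call() and retry on TransientAPIError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientAPIError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # The wait ceiling doubles on each attempt, bounded by max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random amount up to the ceiling so that
            # many clients do not retry in lockstep against a recovering service.
            sleep(random.uniform(0, ceiling))
```

The caller wraps its own request function so that it raises `TransientAPIError` on retryable failures and passes it as `call`. Automatic retries are only safe for idempotent requests; a create or provision call that may have partially succeeded should be checked before it is repeated.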