Alkira incident

Health reporting issue with few CXP regions

Minor Resolved View vendor source →

Alkira experienced a minor incident on October 15, 2024 affecting Connector Health Monitoring Service, lasting 14m. The incident has been resolved; the full update timeline is below.

Started
Oct 15, 2024, 10:55 PM UTC
Resolved
Oct 15, 2024, 11:09 PM UTC
Duration
14m
Detected by Pingoru
Oct 15, 2024, 10:55 PM UTC

Affected components

Connector Health Monitoring Service

Update timeline

  1. investigating Oct 15, 2024, 10:55 PM UTC

    We are actively investigating an issue with the health reporting service in the following CXPs. We will update as we know the root cause the problem. aws-ca-central-1 azure-northcentralus gcp-northamerica-northeast2 gcp-us-central1 gcp-us-east1

  2. identified Oct 15, 2024, 10:56 PM UTC

    The issue has been identified, and we are working towards recovering the service in the mentioned regions.

  3. resolved Oct 15, 2024, 11:09 PM UTC

    One of the underlying nodes serving health monitoring services had failed. While the infrastructure is auto-scalable the failover took longer than expected, resulting in the health monitoring service not reporting the tunnel health status for a few minutes. Currently, the infrastructure is stable, and no issues are observed with any of the services. Reach out to Alkira support for any queries.