Authzed incident

Serverless Platform Ingress outage

Critical Resolved

Authzed experienced a critical incident on September 13, 2022. The incident has been resolved; the full update timeline is below.

Started
Sep 13, 2022, 01:10 AM UTC
Resolved
Sep 13, 2022, 01:10 AM UTC
Duration
Detected by Pingoru
Sep 13, 2022, 01:10 AM UTC

Update timeline

  1. resolved Sep 13, 2022, 01:44 AM UTC

    A configuration change was applied to the SpiceDB Serverless platform's primary network ingress. The problem was noticed immediately and the change was rolled back as soon as possible. There were approximately 7 minutes of outage, between 9:10 PM EDT and 9:17 PM EDT.

    The root cause was a slight version difference between Contour in our production and staging environments.

    Resolution was delayed by a few minor but vital workflows that collectively consumed time:

    - External ingress health-checking that runs on a longer interval than our internal health checks
    - Lack of log retention for crash-looped pods in Kubernetes, which makes the `kubectl logs` workflow less than ideal

    We will be improving our process for vetting configuration to make sure that this cannot happen again.
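
    The root cause described above, a version difference between staging and production, is the kind of drift that can be checked for before a configuration change is promoted. The following is a minimal sketch of such a check, not Authzed's actual tooling. The function names, registry host, and version tags are all hypothetical, and it assumes each environment's component images are already available as a simple name-to-image mapping.

```python
def image_tag(image: str) -> str:
    """Return the tag of a container image reference, or 'latest' if none is given."""
    # Drop any digest suffix, then look for a tag in the last path segment only,
    # so that a registry port such as "localhost:5000" is not mistaken for a tag.
    name = image.split("@", 1)[0]
    last_segment = name.rsplit("/", 1)[-1]
    if ":" in last_segment:
        return last_segment.rsplit(":", 1)[1]
    return "latest"


def find_version_drift(
    staging: dict[str, str], production: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Map each component present in both environments to its (staging, production)
    tags, but only where the two tags differ."""
    drift = {}
    for component in sorted(staging.keys() & production.keys()):
        staging_tag = image_tag(staging[component])
        production_tag = image_tag(production[component])
        if staging_tag != production_tag:
            drift[component] = (staging_tag, production_tag)
    return drift


if __name__ == "__main__":
    # Hypothetical image references; the incident report does not state the
    # actual versions involved.
    staging = {
        "contour": "registry.example/contour:v1.0.0",
        "envoy": "registry.example/envoy:v2.0.0",
    }
    production = {
        "contour": "registry.example/contour:v1.0.1",
        "envoy": "registry.example/envoy:v2.0.0",
    }
    for component, (s_tag, p_tag) in find_version_drift(staging, production).items():
        print(f"{component}: staging={s_tag} production={p_tag}")
```

    A check like this only compares tags. It would not catch two environments that run the same tag with different configuration, so it complements rather than replaces testing the change in staging.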