Archbee incident

Partial outage

Major Resolved View vendor source →

Archbee experienced a major incident on April 20, 2025 affecting API and Content Management System and 1 more component, lasting 2h 10m. The incident has been resolved; the full update timeline is below.

Started
Apr 20, 2025, 12:36 PM UTC
Resolved
Apr 20, 2025, 02:47 PM UTC
Duration
2h 10m
Detected by Pingoru
Apr 20, 2025, 12:36 PM UTC

Affected components

APIContent Management SystemHosted Documentation Sites

Update timeline

  1. investigating Apr 20, 2025, 12:36 PM UTC

    We are currently investigating the issue with EU servers and potentially system wide.

  2. investigating Apr 20, 2025, 01:18 PM UTC

    We have determined the root cause of the issue and we're deploying a fix.

  3. resolved Apr 20, 2025, 02:47 PM UTC

    We have resolved the issue. We are sorry for the inconvenience and we're taking steps to prevent this from ever happening again. We are writing a postmortem and releasing the notes soon.

  4. postmortem Apr 20, 2025, 02:47 PM UTC

    We use a hosted/managed database service from Azure. Because we are careful with customer data, and want to fulfill our end to end encryption promise, we use SSL when connecting to any database, this one included. A few weeks ago, we saw a notification about a SSL certificate change in our of our test databases, and we made the SSL certificate change for that specific database. Because we were 100% sure that the same thing will happen soon with our main database, we made a change preemptively while making sure multiple times that the change is backwards compatible, using both certificates in the certificate chain. After deploying the change, our systems went on undisturbed for over two weeks. Today, while having a planned database update from Azure, the system suddenly stopped working due to the new change we had made.