Beyond Identity incident

Windows Release 2.84.1 Unable to Launch

Major Resolved

Beyond Identity experienced a major incident on August 16, 2023. The incident has been resolved; the full update timeline is below.

Started
Aug 16, 2023, 05:05 PM UTC
Resolved
Aug 07, 2023, 09:06 PM UTC
Duration
Detected by Pingoru
Aug 16, 2023, 05:05 PM UTC

Update timeline

  1. resolved Aug 16, 2023, 05:05 PM UTC

    The Windows Platform Authenticator release 2.84.1 was broken. After upgrading to that version, the authenticator was unable to launch.

  2. postmortem Aug 16, 2023, 05:06 PM UTC

    # Executive Summary

    ## Summary

    On August 7th, 2023, at 4:06 PM CDT, Beyond Identity released a hotfix to address the Windows Desktop Login issue. The build pipeline encountered an issue, and we released a broken build. After upgrading to the broken build 2.84.1, the Windows Platform Authenticator did not launch due to a missing reference, which prevented users from using the product entirely. Post-release testing in production detected the issue, and we reverted the release at 5:14 PM CDT. A service bulletin was released on August 8th at 11:44 AM CDT to address the issue.

    The impact continued for users who had received the upgrade dialog but had not yet responded to it. If they chose to upgrade, even after we had reverted the release, the dialog could still install the broken version. Users who dismissed the dialog were not impacted. The broken binaries were removed from storage services on August 8th at 1:06 PM CDT, which prevented installation of the broken version from a dialog that was still open.

    Once the hotfix 2.84.2 was released on August 8th at 10:02 AM CDT, the fix was to upgrade to that version. Before the fixed version was released, the fix was to uninstall and re-install the previous version. The service bulletin document was updated accordingly.

    ## Impacted Customers

    This issue impacted all customers using the Windows Platform Authenticator (not the Windows Desktop Login version) without using version control.

    ## External Links

    * [Service Bulletin #230 - Windows Release 2.84.1 Issue – Beyond Identity](https://support.beyondidentity.com/hc/en-us/articles/16602800489367-Service-Bulletin-230-Windows-Release-2-84-1-Issue)

    # Incident Details

    ## Leadup

    Beyond Identity released hotfix 2.84.1 to address the Windows Desktop Login enrollment issue, which also required a change in the Windows Platform Authenticator. Both the WDL and Windows PA releases were broken.

    ## Fault

    The Windows Platform Authenticator did not launch at all, so users were unable to use the product.

    ## Detection

    The support team validated the WDL fix as a post-release check and detected that the Platform Authenticator did not launch after upgrading.

    ## Root causes

    * The build pipeline errored with dependencies.
    * A Rust dependency generation error caused the build pipeline to pull the wrong library version.
    * The manifest and downloads page were updated automatically after the binaries were released as a part of the hotfix release flow.
    * When we identified the issue, reverting the release did not include removing the broken binaries.
    * A broken build got released.

    ## Mitigation and resolution

    * The release was reverted from the release manifest.
    * The download page was reverted.
    * Hotfix 2.84.2 was released.
    * Broken binaries were removed from production.
    * A service bulletin was released.

    ## Lessons learned

    * The hotfix release process needs to have a stop-gap between publishing binaries and releasing the manifest and download page update. ([SRE-1721](https://beyondidentity.atlassian.net/browse/SRE-1721))
    * Reverting the release needs to prevent access to the reverted release-related binaries. ([SRE-1720](https://beyondidentity.atlassian.net/browse/SRE-1720))
    * Identified a new test case for the QA team to perform in the pre-release environment. ([BIT-1566](https://beyondidentity.atlassian.net/browse/BIT-1566))
    * The statuspage needs to be updated to have more detailed services listed. ([SUPO-820](https://beyondidentity.atlassian.net/browse/SUPO-820))

    ## Timeline

    * 2023-08-07 4:06 PM CDT - Build pipeline completed the release and updated the manifest and download page
    * 2023-08-07 4:16 PM CDT - An issue with the build was identified
    * 2023-08-07 4:20 PM CDT - The engineering team started investigating the issue
    * 2023-08-07 4:58 PM CDT - The issue was reproduced, and the build was identified as defective
    * 2023-08-07 5:05 PM CDT - The site reliability engineering team was dispatched to revert the release
    * 2023-08-07 5:14 PM CDT - Release manifest reverted
    * 2023-08-07 5:32 PM CDT - Downloads page reverted
    * 2023-08-07 7:27 PM CDT - Root cause identified and fix implemented
    * 2023-08-07 9:47 PM CDT - The QA team started the release testing process
    * 2023-08-08 10:02 AM CDT - Hotfix 2.84.2 was released
    * 2023-08-08 11:44 AM CDT - Service bulletin released

    ## Metrics

    * Time to Detection - 10 minutes
    * Time to SME investigation - 4 minutes
    * Time to Repair - 54 minutes
    * Time to Recovery - 17h 56 minutes
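
The metrics above can be reproduced from the timeline timestamps. The following is a minimal sketch, not part of the vendor postmortem. The postmortem does not define which pair of events each metric spans, so the pairings below are assumptions inferred from the arithmetic, and the variable and function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# CDT is UTC-5; a fixed offset avoids depending on a time zone database.
CDT = timezone(timedelta(hours=-5), "CDT")


def at(stamp: str) -> datetime:
    """Parse a timeline stamp such as '2023-08-07 4:06 PM' as CDT."""
    return datetime.strptime(stamp, "%Y-%m-%d %I:%M %p").replace(tzinfo=CDT)


def fmt(delta: timedelta) -> str:
    """Render a duration in the style used by the Metrics section."""
    hours, mins = divmod(int(delta.total_seconds() // 60), 60)
    return f"{hours}h {mins} minutes" if hours else f"{mins} minutes"


# Events copied from the postmortem timeline.
released = at("2023-08-07 4:06 PM")           # manifest and download page updated
identified = at("2023-08-07 4:16 PM")         # issue with the build identified
investigating = at("2023-08-07 4:20 PM")      # engineering started investigating
manifest_reverted = at("2023-08-07 5:14 PM")  # release manifest reverted
hotfix_released = at("2023-08-08 10:02 AM")   # hotfix 2.84.2 released

# Assumed start/end pairings; they are the ones that match the published figures.
metrics = {
    "Time to Detection": identified - released,
    "Time to SME investigation": investigating - identified,
    "Time to Repair": manifest_reverted - investigating,
    "Time to Recovery": hotfix_released - released,
}

for name, delta in metrics.items():
    print(f"{name} - {fmt(delta)}")

# The release time expressed in UTC, for comparison with the page header.
released_utc = released.astimezone(timezone.utc)
print(released_utc.strftime("%b %d, %Y, %I:%M %p UTC"))
```

Under these pairings the script prints the same four figures as the Metrics section. The last line prints `Aug 07, 2023, 09:06 PM UTC`, which shows that the timestamp listed under "Resolved" in the page header coincides with the 4:06 PM CDT release time in the postmortem timeline.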