Templafy incident

Service degradation - Third-party Add-in packages failed to download

Major Resolved

Templafy experienced a major incident on October 23, 2024 affecting Add-in Management, lasting 3h 41m. The incident has been resolved; the full update timeline is below.

Started
Oct 23, 2024, 10:21 AM UTC
Resolved
Oct 23, 2024, 02:03 PM UTC
Duration
3h 41m
Detected by Pingoru
Oct 23, 2024, 10:21 AM UTC

Affected components

Add-in Management

Update timeline

  1. identified Oct 23, 2024, 10:21 AM UTC

    We have identified an issue that affects a subset of customers and are working towards a resolution. Further updates will be posted here soon.

  2. monitoring Oct 23, 2024, 01:14 PM UTC

    The incident has been successfully mitigated, and our team is actively monitoring the situation to ensure ongoing stability and performance. The third-party Add-ins will be reactivated during the next synchronization of our Templafy Desktop component.

  3. resolved Oct 23, 2024, 02:03 PM UTC

    The incident has been resolved, and further information will be provided in a postmortem shortly. We apologize for the impact to affected customers.

  4. postmortem Oct 25, 2024, 01:33 PM UTC

    **Incident Initiation** To improve the automation and security of package management, the engineering team implemented a series of changes. These changes expanded the status options for each package and required a database migration. However, they unintentionally caused the endpoint that retrieves the current packages for users to exclude third-party packages. Consequently, these packages were removed from users' computers during the next Templafy Desktop run, making the associated Add-ins unavailable in Office applications. The change was deployed across all environments at 8:20 AM CEST on October 23, 2024, though it only started impacting users once the Templafy Desktop component triggered a package update. The issue was first reported at 11:44 AM CEST.

    **Investigation** The incident was declared at 12:10 PM CEST, when the software engineering team was notified. The root cause was quickly identified, and the engineering team immediately started implementing a fix so that the endpoint that retrieves Add-ins correctly included third-party packages.

    **Impact and Scope** The incident impacted all regions but affected only tenants with third-party Add-ins installed. Among these, only users whose Templafy Desktop triggered a package update within the incident timeframe were affected.

    **Resolution** By 3:14 PM CEST, the code fix was deployed across all production environments. This update allowed the Templafy Desktop component to retrieve and reinstall third-party Add-ins, so users could access them again after a Templafy Desktop synchronization.

    **Post-Incident Actions** The engineering team has started an extensive review to expand test cases for package delivery, activation, and security assertion, so that future enhancements to package functionality do not affect the integrity and stability of existing configurations.
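The failure mode described in the postmortem, a package-listing endpoint that silently drops third-party packages after new status values are introduced, is the kind of regression a small test can catch. The sketch below is illustrative only: `PackageStatus`, `Package`, `current_packages`, and `missing_third_party` are hypothetical names, and the status values are assumptions, since the postmortem does not describe Templafy's actual data model or API.

```python
from dataclasses import dataclass
from enum import Enum


class PackageStatus(Enum):
    # Hypothetical values; the postmortem only says the status options were expanded.
    ACTIVE = "active"
    PENDING_REVIEW = "pending_review"
    DISABLED = "disabled"


@dataclass(frozen=True)
class Package:
    name: str
    third_party: bool
    status: PackageStatus


def current_packages(packages):
    """Return the packages a client should keep installed.

    The filter looks at status only, so first-party and third-party
    packages are treated the same way.
    """
    return [p for p in packages if p.status is PackageStatus.ACTIVE]


def missing_third_party(before, after):
    """Names of third-party packages present in `before` but absent from `after`."""
    after_names = {p.name for p in after}
    return sorted(
        p.name for p in before if p.third_party and p.name not in after_names
    )


# Example catalog (made-up package names).
catalog = [
    Package("core-templates", False, PackageStatus.ACTIVE),
    Package("vendor-addin", True, PackageStatus.ACTIVE),
    Package("old-addin", True, PackageStatus.DISABLED),
]

installed = catalog[:2]               # what a client currently has
response = current_packages(catalog)  # what the endpoint would return

# Regression check: no active third-party package may disappear.
assert missing_third_party(installed, response) == []
```

A check like `missing_third_party` could run in a test suite against the endpoint's response before and after a migration, failing the build if any previously delivered third-party package is no longer returned.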