Templafy incident
Service degradation - Manually configured claims in SAML2 appear to no longer be coming to User Profiles in Templafy
Templafy experienced a major incident on September 24, 2024 affecting User Management, lasting 7h 45m. The incident has been resolved; the full update timeline is below.
Update timeline
- identified Sep 24, 2024, 12:54 PM UTC
We have identified an issue that affects a subset of customers and are working towards a resolution. Further updates will be posted here soon.
- monitoring Sep 24, 2024, 08:33 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Sep 24, 2024, 08:40 PM UTC
The incident has been resolved, and further information will be provided in a postmortem shortly. We apologize for the impact to affected customers.
- postmortem Sep 26, 2024, 12:19 PM UTC
Incident Initiation: The issue started on September 20, 2024, at 10:54 AM CET but was detected on September 23, 2024, at 6:02 PM CET. On September 24, 2024, at 2:37 PM CET, the incident response process was initiated after users reported that manually configured User claims in the SAML2 authentication protocol within the Hive environment were no longer being recognized. The issue caused users to lose their User Claims when logging in via SAML2 authentication, due to a mismatch between customer configurations and the system's case-sensitive claims checks.
Investigation: The issue affected only a subset of customers, making it difficult to reproduce and investigate at first. Engineers closely examined the system and identified that the problem was caused by an external package update that implicitly upgraded the "Microsoft.IdentityModel.*" packages from version 7.x to 8.x. This update introduced case sensitivity in claims checking, which conflicted with how some customers had configured their SAML2 claims. Once the root cause was pinpointed, the team began working on a solution to mitigate the issue.
Resolution: After identifying the root cause, the configuration was adjusted to account for the case-sensitive claims checks. Normal functionality was restored, and users could resume all affected operations, resolving the incident on September 24, 2024, at 10:31 PM CET.
Post-Incident Actions: Moving forward, we will focus on improving our monitoring and testing processes for package updates. Closer collaboration with customers and better detection mechanisms will help us identify and resolve similar issues more efficiently.
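Illustrative sketch: the postmortem attributes the lost claims to a case-sensitive comparison meeting customer configurations whose casing differed from the incoming SAML2 claims. The Python below is a minimal, hypothetical model of that kind of mismatch. It is not Templafy's code and does not use the Microsoft.IdentityModel API; the function names, profile fields, and claim types are all assumptions made for illustration.

```python
# Hypothetical illustration only. Function names, profile fields, and claim
# types are assumptions; this is not Templafy's or Microsoft.IdentityModel's code.


def map_claims_case_sensitive(configured, incoming):
    """Keep only profile fields whose configured claim type matches exactly."""
    return {
        field: incoming[claim_type]
        for field, claim_type in configured.items()
        if claim_type in incoming
    }


def map_claims_case_insensitive(configured, incoming):
    """Same mapping, but compare claim types after case-folding both sides."""
    folded = {claim_type.casefold(): value for claim_type, value in incoming.items()}
    return {
        field: folded[claim_type.casefold()]
        for field, claim_type in configured.items()
        if claim_type.casefold() in folded
    }


# Assumed customer configuration: profile field -> claim type as typed by an admin.
configured = {"department": "Department", "jobTitle": "jobtitle"}

# Assumed claims as they arrive in the SAML2 assertion (casing differs for one).
incoming = {"department": "Finance", "jobtitle": "Analyst"}

strict = map_claims_case_sensitive(configured, incoming)
relaxed = map_claims_case_insensitive(configured, incoming)

print(strict)   # the "department" field is dropped because "Department" != "department"
print(relaxed)  # both fields are populated
```

In this model, only configurations whose casing happens to differ from the incoming claims lose data, which is consistent with the report that only a subset of customers was affected. A regression test of this shape, run against mixed-case configurations after each dependency upgrade, is one way such a change in comparison behavior could be caught earlier.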