Convercent incident
Convercent platform intermittent SSO login failures
Convercent experienced a major incident on January 8, 2024 affecting EU Production and US Production, lasting 1d 23h. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Jan 08, 2024, 04:41 PM UTC
We’ve identified an issue where some users may experience login failures via SSO authentication. Impact is limited to users accessing the service through the US and EU production environments. If you are experiencing issues, please contact our support teams quoting IM-370. We’re investigating the issue and will provide an update as soon as possible.
- investigating Jan 08, 2024, 05:18 PM UTC
Impact is limited to customers using SAML encryption to log in via SSO authentication. If you are experiencing issues, please contact our support teams quoting IM-370.
- monitoring Jan 08, 2024, 06:35 PM UTC
To resolve the issue, affected customers are required to update metadata within their SSO certificate settings. Please contact our support teams quoting IM-370 for further assistance.
- monitoring Jan 08, 2024, 07:38 PM UTC
To resolve the issue, affected customers are required to update metadata within their SSO certificate settings. Please contact our support teams quoting IM-370 for further assistance.
- monitoring Jan 09, 2024, 09:10 PM UTC
For customers who use single sign-on (SSO) with encryption enabled, an additional step is required to restore full functionality: these customers should update their certificate metadata to match the new certificate published on the site. Impacted customers are being contacted by our support teams.
- resolved Jan 10, 2024, 04:02 PM UTC
The issue has been resolved. If you are still experiencing issues, please contact our support teams quoting IM-370. Root cause analysis investigations have been initiated and an RCA will be provided.
- postmortem Jan 29, 2024, 08:08 AM UTC
# Event Description

Between Saturday 6th and Monday 8th January 2024, customers using the OneTrust Convercent platform received ‘Invalid username and password’ notifications when attempting to access their production and trial environments. The impact extended to new case creation: new cases could not be reported because submissions failed.

# Findings and Root Cause

Upon engagement, engineering teams identified the root cause as a configuration that was missed during a recent certificate update. Updating the configuration resolved the incident for customers that were not using SSO encryption. Customers that had SSO encryption enabled were required to update their certificate metadata, and OneTrust support teams contacted them to assist with this process.

There are three certificates that require regular updates. The process relies upon manual intervention, and one of the certificate configurations was overlooked during the update process.

**How could this incident have been avoided?**

By automating the certificate update process, or by creating a checklist if the manual process is continued.

**How could we have detected the issue sooner?**

Proactive alerts are in place; however, they were not executed due to a configuration issue.

**Is there a contingency or plan to control future incidents of this kind?**

We are exploring the possibility of automating the process. Until then, a checklist has been created to assist engineers with the process.

**If related to a change, why was it not discovered in testing?**

Infrastructure certificate updates are not subjected to testing.

# Corrective Actions

* Update the missing configuration on the certificate
* Add a checklist to be used for expired certificate rotations
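
The checklist described above could, in principle, be expressed as a small automated check. The following is a minimal sketch only, not OneTrust's actual tooling: the `CertRecord` fields, the certificate names, the dates, and the 30-day warning window are all hypothetical and chosen purely for illustration. It flags any certificate that is expired, close to expiry, or whose dependent configuration was not refreshed after a rotation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CertRecord:
    """One entry in a (hypothetical) certificate rotation checklist."""
    name: str              # illustrative label, not a real certificate name
    not_after: datetime    # expiry of the certificate currently deployed
    config_updated: bool   # True once dependent configuration was refreshed


def rotation_gaps(records, now, warn_within=timedelta(days=30)):
    """Return (name, problem) pairs for certificates that need attention."""
    gaps = []
    for rec in records:
        # Expiry checks: already expired, or inside the warning window.
        if rec.not_after <= now:
            gaps.append((rec.name, "expired"))
        elif rec.not_after - now <= warn_within:
            gaps.append((rec.name, "expiring soon"))
        # Independent check: the certificate was rotated but the
        # configuration that depends on it was not updated.
        if not rec.config_updated:
            gaps.append((rec.name, "configuration not updated"))
    return gaps


# Hypothetical inventory of three certificates, one with a missed config step.
now = datetime(2024, 1, 8, tzinfo=timezone.utc)
inventory = [
    CertRecord("cert-a", datetime(2025, 1, 6, tzinfo=timezone.utc), True),
    CertRecord("cert-b", datetime(2025, 1, 6, tzinfo=timezone.utc), False),
    CertRecord("cert-c", datetime(2024, 1, 20, tzinfo=timezone.utc), True),
]

for name, problem in rotation_gaps(inventory, now):
    print(f"{name}: {problem}")
```

Running a check of this kind after every rotation, rather than relying on memory, is one way to catch an overlooked configuration step before customers are affected.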