SecureAuth incident

SA IdP Dashboard Data is not Updating

Minor Resolved

SecureAuth experienced a minor incident on May 12, 2023, lasting 8h 43m. The incident has been resolved; the full update timeline is below.

Started
May 12, 2023, 05:32 PM UTC
Resolved
May 13, 2023, 02:15 AM UTC
Duration
8h 43m
Detected by Pingoru
May 12, 2023, 05:32 PM UTC

Update timeline

  1. investigating May 12, 2023, 05:32 PM UTC

    The Dashboard data for the SA IdP is not updating for login information. This is not impacting any SA IdP services or logins, only the reporting mechanism.

  2. identified May 12, 2023, 06:28 PM UTC

    The issue was an expired certificate within the backend services. The certificate has been renewed, and the job that refreshes the dashboards will run later this afternoon. Dashboards should be updated at roughly 1900 PST.

  3. monitoring May 12, 2023, 06:28 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved May 13, 2023, 02:15 AM UTC

    Dashboards are again populating with data. If you continue to have issues, please contact [email protected]

  5. postmortem May 16, 2023, 10:38 PM UTC

    We apologize for any inconvenience experienced during this incident, and we treat such disruptions seriously. Below is a summary of the issue, a root cause, and what we are doing to improve service going forward.

    Incident Description

    A TLS certificate that is used on an endpoint for SA IdP dashboard metric data collection expired on May 9. This meant HTTPS connections to this endpoint failed and caused SA IdP dashboard metric data to become outdated, degrading service for part of the administrative console. After some customers reported an issue with outdated metrics, the SecureAuth team investigated and fixed the issue by renewing the TLS certificate. The new certificate was in place by 6:10 PM on May 12. Data was refreshed around 1:30 AM on May 13, allowing the normal metric refresh period to run its course.

    Root Cause

    The DevOps team recently replaced several expiring certificates, and this one was missed due to a monitoring gap.

    Corrective Actions

    The DevOps team is following up by ensuring the TLS certificate monitoring gap is addressed and by reviewing additional monitoring requirements for the endpoint in question.

    Note: All times/dates are UTC.
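
The monitoring gap named in the postmortem is the kind of thing a periodic expiry check can catch. The following is a minimal sketch in Python, not SecureAuth's actual tooling: the function names, the 30-day `WARN_BEFORE` threshold, and the example timestamps are all assumptions made for illustration. It uses only the standard-library `ssl` and `socket` modules.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone

# Hypothetical alert threshold: warn when fewer than 30 days remain.
WARN_BEFORE = timedelta(days=30)


def time_until_expiry(not_after: str, now: datetime) -> timedelta:
    """Return the time left before a certificate's notAfter timestamp.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    e.g. "May  9 00:00:00 2023 GMT". The result is negative once expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return expires - now


def classify(not_after: str, now: datetime) -> str:
    """Map the remaining lifetime to "ok", "expiring", or "expired"."""
    remaining = time_until_expiry(not_after, now)
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= WARN_BEFORE:
        return "expiring"
    return "ok"


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read notAfter from the certificate that a live endpoint presents.

    A verifying context rejects an already expired certificate during the
    handshake with ssl.SSLCertVerificationError, so a caller should treat
    that exception as a failed check as well.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

As a usage illustration, assume a certificate whose `notAfter` is `"May  9 00:00:00 2023 GMT"` (the postmortem gives only the date, so the time of day is made up). Calling `classify` on that string with a `now` of April 20, 2023 returns `"expiring"`, while a `now` of May 12, 2023 returns `"expired"`. Run on a schedule against every endpoint, including internal ones such as a metrics collection endpoint, a check like this would raise a warning well before connections start to fail.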