LogDNA incident

User sessions are timing out and customers are required to log in again

Minor · Resolved

LogDNA experienced a minor incident on June 19, 2023 affecting Web App, lasting 2h 48m. The incident has been resolved; the full update timeline is below.

Started
Jun 19, 2023, 11:09 AM UTC
Resolved
Jun 19, 2023, 01:58 PM UTC
Duration
2h 48m
Detected by Pingoru
Jun 19, 2023, 11:09 AM UTC

Affected components

Web App

Update timeline

  1. investigating Jun 19, 2023, 11:09 AM UTC

    User sessions to our Web UI are timing out and customers using the UI have to log in every 1-2 minutes. We are investigating why this is happening, but the rest of the service is fully functional. No other components are affected.

  2. identified Jun 19, 2023, 11:13 AM UTC

    The issue has been identified, and a fix is being implemented.

  3. monitoring Jun 19, 2023, 11:27 AM UTC

    The fix was implemented and we are now monitoring the user login sessions.

  4. resolved Jun 19, 2023, 01:58 PM UTC

    This incident has been resolved.

  5. postmortem Jun 28, 2023, 06:30 PM UTC

    **Dates:**
    Start Time: Monday, June 19, 2023, at 10:31 UTC
    End Time: Monday, June 19, 2023, at 12:35 UTC
    Duration: 124 minutes

    **What happened:**
    Users were being logged out of our Web UI frequently, within 1-2 minutes of logging in. Users could log in again successfully, but the new session would also expire quickly.

    **Why it happened:**
    The cache of logged-in users held in our Redis database was being cleared every 1-2 minutes. This caused all user sessions to expire and required users to log in again. We have not yet determined why the cache was being cleared at such frequent intervals.

    **How we fixed it:**
    We restarted the pods running the Redis database, and the cache behavior returned to normal.

    **What we are doing to prevent it from happening again:**
    We will investigate further to learn why the Redis cache was being cleared so frequently.
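
The symptom described in the postmortem, a session cache that empties every 1-2 minutes, could in principle be caught by watching for sharp drops in the number of cached keys. The following is a minimal sketch of that idea, not the vendor's tooling: it assumes key counts are sampled periodically (for example with the Redis `DBSIZE` command), and the function name `detect_flushes`, the `drop_ratio` and `min_keys` thresholds, and the sample data are all hypothetical.

```python
from typing import List, Tuple


def detect_flushes(
    samples: List[Tuple[float, int]],
    drop_ratio: float = 0.9,
    min_keys: int = 10,
) -> List[float]:
    """Return timestamps where the key count fell sharply.

    samples    -- (timestamp_seconds, key_count) pairs in time order
    drop_ratio -- fraction of keys that must vanish between two
                  consecutive samples to count as a flush (assumed value)
    min_keys   -- ignore drops from an already near-empty cache
    """
    flushes = []
    # Compare each sample with the one immediately before it.
    for (_, prev_count), (ts, count) in zip(samples, samples[1:]):
        if prev_count >= min_keys and count <= prev_count * (1 - drop_ratio):
            flushes.append(ts)
    return flushes


# Hypothetical samples taken every 30 seconds: the cache refills as
# users log back in, then is emptied again.
samples = [(0, 500), (30, 520), (60, 3), (90, 140), (120, 310), (150, 0), (180, 95)]
print(detect_flushes(samples))
```

A check like this only flags that the cache was emptied; it does not explain why, which is the open question the postmortem leaves for further investigation.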