LogDNA incident
User sessions are timing out and customers are required to log in again
LogDNA experienced a minor incident on June 19, 2023 affecting Web App, lasting 2h 48m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Jun 19, 2023, 11:09 AM UTC
User sessions to our Web UI are timing out and customers using the UI have to log in every 1-2 minutes. We are investigating why this is happening, but the rest of the service is fully functional. No other components are affected.
- identified Jun 19, 2023, 11:13 AM UTC
The issue has been identified, and a fix is being implemented.
- monitoring Jun 19, 2023, 11:27 AM UTC
The fix was implemented and we are now monitoring the user login sessions.
- resolved Jun 19, 2023, 01:58 PM UTC
This incident has been resolved.
- postmortem Jun 28, 2023, 06:30 PM UTC
**Dates:**
Start Time: Monday, June 19, 2023, at 10:31 UTC
End Time: Monday, June 19, 2023, at 12:35 UTC
Duration: 124 minutes

**What happened:** Users were being logged out of our Web UI within 1-2 minutes of logging in. Users could log in again successfully, but the new session would also expire quickly.

**Why it happened:** The cache of logged-in users held in our Redis database was being cleared every 1-2 minutes. Each time the cache was cleared, all user sessions expired and users had to log in again. We have not yet determined why the cache was being cleared at such frequent intervals.

**How we fixed it:** We restarted the pods running the Redis database, and the cache behavior returned to normal.

**What we are doing to prevent it from happening again:** We will investigate further to learn why the Redis cache was being cleared so frequently.
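The cause of the cache clearing is still unknown, so the following is only an illustrative sketch of how a repeated clearing pattern like the one described above might be spotted from periodic key-count samples. It is not our monitoring code. The function name `find_cache_flushes`, the `drop_ratio` threshold, and the sample values are all hypothetical. The sketch assumes the samples are `(timestamp_seconds, key_count)` pairs collected by some external poller, for example one that runs the Redis `DBSIZE` command at a fixed interval.

```python
from typing import List, Tuple


def find_cache_flushes(
    samples: List[Tuple[float, int]],
    drop_ratio: float = 0.9,
) -> List[float]:
    """Return the timestamps at which the key count fell sharply.

    A sample is flagged when the key count dropped by at least
    `drop_ratio` (as a fraction of the previous sample's count).
    Both the threshold and the sampling scheme are assumptions
    made for illustration.
    """
    flushes = []
    # Compare each sample with the one immediately before it.
    for (_, prev_count), (ts, count) in zip(samples, samples[1:]):
        if prev_count > 0 and (prev_count - count) / prev_count >= drop_ratio:
            flushes.append(ts)
    return flushes


# Hypothetical samples taken every 30 seconds: the key count collapses
# at t=60 and again at t=120, similar to a cache cleared every 1-2 minutes.
example_samples = [(0, 1000), (30, 1010), (60, 5), (90, 400), (120, 2)]
example_flushes = find_cache_flushes(example_samples)
```

With the hypothetical samples above, `example_flushes` contains the timestamps 60 and 120. A detector of this kind only reports that the key count collapsed; it does not explain why, so it would complement rather than replace the root-cause investigation.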