UserVoice experienced a major incident on October 26, 2018 affecting Web Portal (subdomain) and Admin Console, lasting 43m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Oct 26, 2018, 05:30 PM UTC
We are investigating 520 and 502 errors in UserVoice admin consoles and on web portals.
- investigating Oct 26, 2018, 05:31 PM UTC
We are continuing to investigate this issue.
- monitoring Oct 26, 2018, 05:43 PM UTC
The system has recovered and is operating normally. We will investigate the root cause and follow up with a postmortem.
- resolved Oct 26, 2018, 06:13 PM UTC
This incident has been resolved.
- postmortem Oct 31, 2018, 02:02 PM UTC
On October 26th between 10:20 and 10:35 PDT, UserVoice experienced an infrastructure issue that caused intermittent system unavailability.

## Business Impact

During the outage, end users and admins may have been unable to load or interact with UserVoice sites or widgets.

## Root Cause

One of the servers in UserVoice's database cluster experienced an application stall event. Because our cluster was misconfigured, the stall paused database writes. Our engineers had to manually remove the affected node to allow the cluster to resume writing.

## What We Are Doing to Prevent This

We have updated our database cluster configuration to more aggressively monitor and remove cluster members that are non-performant.

This incident interrupted your team's work in UserVoice and your end users' ability to view and submit feedback. We are sorry for the disruption this caused. We have already put improvements in place to prevent this type of issue from happening again.

If you have any questions, don’t hesitate to reach out at [[email protected]](mailto:[email protected]).

Claire Talbott
Support Manager