Envoy incident
New users logging into the mobile app are unable to receive the confirmation email
Envoy experienced a major incident on December 17, 2024 affecting Mobile, lasting 1d 3h. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Dec 17, 2024, 09:20 PM UTC
We are aware of an issue where users are unable to receive the QR code confirmation email when logging into the mobile app. Our team is currently investigating the issue.
- identified Dec 17, 2024, 09:42 PM UTC
Our team has identified the root cause of the secondary dashboard accessibility issue related to the mobile app and QR codes, and is actively working on a resolution. We sincerely apologize for any inconvenience this may cause. If you have feedback or need assistance, please don't hesitate to contact Envoy Customer Support through the live chat messenger or by emailing us at [email protected].
- monitoring Dec 18, 2024, 12:07 AM UTC
A fix has been implemented, and we are actively monitoring to ensure the service remains stable without further disruption. We sincerely apologize for any inconvenience this may cause. If you have feedback or need assistance, please don't hesitate to contact Envoy Customer Support through the live chat messenger or by emailing us at [email protected].
- resolved Dec 18, 2024, 08:26 AM UTC
The incident was resolved at around 5:00 PM PT. We appreciate your patience as we monitored the issue to completion. For any additional questions, please contact Customer Support via the live chat messenger or at [email protected].
- postmortem Jan 03, 2025, 04:19 PM UTC
On December 17 at 10:50 AM PST, an internal data task ran repeatedly and caused an outage on one of our core servers. This overloaded job processing, which impacted one of our endpoints tied to logins and, in turn, caused dashboard login issues. By 11:30 AM PST, our team had identified the responsible job and paused a few jobs to relieve the system and restore dashboard availability. However, due to the significant backlog in the queue, emails and notifications were delayed as the system recovered. Throughout the day, our team continued to optimize and scale up, managing the workload until the backlog was cleared and everything was back to normal by 4:30 PM PST. An internal post-mortem has been organized to analyze the incident and note key learnings and action items. If you have any concerns or require additional information, please don't hesitate to contact our Customer Support team via [[email protected]](mailto:[email protected]) or contact your dedicated account manager.