Hosted Mender incident

Emails from Hosted Mender are not being sent.

Minor Resolved

Hosted Mender experienced a minor incident on May 27, 2024 affecting Hosted Mender US and Hosted Mender EU, lasting 3h 31m. The incident has been resolved; the full update timeline is below.

Started
May 27, 2024, 06:50 PM UTC
Resolved
May 27, 2024, 10:22 PM UTC
Duration
3h 31m
Detected by Pingoru
May 27, 2024, 06:50 PM UTC

Affected components

Hosted Mender US
Hosted Mender EU

Update timeline

  1. investigating May 27, 2024, 06:50 PM UTC

    Customers report that emails from Hosted Mender are not being sent. We are currently investigating the issue.

  2. identified May 27, 2024, 07:16 PM UTC

    This issue has been identified and a fix is being implemented.

  3. monitoring May 27, 2024, 10:17 PM UTC

    A fix has been implemented and we're monitoring the results.

  4. resolved May 27, 2024, 10:22 PM UTC

    This incident has been resolved. You can now send emails again from Hosted Mender.

  5. postmortem May 28, 2024, 12:57 PM UTC

    On May 23rd, during a planned cluster update, we switched the Mender deployment from custom manifests managed with Kustomize to the official Mender Helm Chart. The Helm Chart uses a custom SMTP secret to load SMTP settings into environment variables, and this behavior could not be disabled. The previous Kustomize deployment also used a custom SMTP secret for the same purpose.

    The issue arose during the transition to the Helm Chart deployment because both the old and the new SMTP secrets were retained. Both secrets attempted to populate the same SMTP environment variables, and the wrong one prevailed, loading the predefined Helm Chart values. As a result, the workflows service could not send emails.

    After the switch to the Helm Chart deployment, we tested most Mender features, both manually and automatically, but email sending was not on the checklist, so we were first alerted by customers.

    **Resolution**

    We applied a fix to the Helm Chart that allows users and Hosted Mender to bring their existing SMTP secret to the workflows service, rather than having the chart create an integrated one.

    **Root cause**

    Two competing SMTP secrets were populating the same SMTP variables with different values.

    **Improvement plan**

    We added the email sending feature to the post-upgrade checklist.
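The root cause described above (two sources setting the same variables to different values) is the kind of conflict that can be checked mechanically before a rollout. The following is a minimal sketch of such a check, not the tooling used by Hosted Mender. The function name `find_conflicts`, the secret names `smtp-legacy` and `smtp-chart-default`, and the keys `SMTP_HOST` and `SMTP_USERNAME` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# A "source" is a (name, key/value mapping) pair, e.g. the decoded data of a
# secret that gets loaded into a container's environment. Hypothetical model.
Source = Tuple[str, Dict[str, str]]


def find_conflicts(sources: List[Source]) -> Dict[str, List[Tuple[str, str]]]:
    """Return every key that is set by more than one source with differing values.

    The result maps each conflicting key to the list of (source name, value)
    pairs that define it, in the order the sources were given.
    """
    seen: Dict[str, List[Tuple[str, str]]] = {}
    for name, data in sources:
        for key, value in data.items():
            seen.setdefault(key, []).append((name, value))
    return {
        key: entries
        for key, entries in seen.items()
        if len({value for _, value in entries}) > 1
    }


# Hypothetical example data: the old secret and the chart-provided default
# disagree on the host but agree on the username.
legacy = {"SMTP_HOST": "smtp.example.com", "SMTP_USERNAME": "mailer"}
chart_default = {"SMTP_HOST": "localhost", "SMTP_USERNAME": "mailer"}

conflicts = find_conflicts(
    [("smtp-legacy", legacy), ("smtp-chart-default", chart_default)]
)

for key, entries in conflicts.items():
    print(f"{key} is defined with different values by:")
    for source_name, value in entries:
        print(f"  {source_name}: {value}")
```

Only keys whose values actually differ are reported, so a key that both sources set identically (here `SMTP_USERNAME`) is not flagged. A check like this could complement the post-upgrade checklist item, since it catches the conflict before any email is attempted.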