Zonos incident

2023-05-03 Issue with Quoter

Critical Resolved

Zonos experienced a critical incident on May 3, 2023 affecting Quoter, lasting 47m. The incident has been resolved; the full update timeline is below.

Started
May 03, 2023, 02:09 PM UTC
Resolved
May 03, 2023, 02:57 PM UTC
Duration
47m
Detected by Pingoru
May 03, 2023, 02:09 PM UTC

Affected components

Quoter

Update timeline

  1. investigating May 03, 2023, 02:24 PM UTC

    We are experiencing a service outage with Quoter. Our team is currently working to restore service. We apologize for any inconvenience. All users may be affected. We will provide an additional update within an hour.

  2. identified May 03, 2023, 02:35 PM UTC

    The issue has been identified and a fix is being implemented.

  3. identified May 03, 2023, 02:35 PM UTC

    We are continuing to work on a fix for this issue.

  4. monitoring May 03, 2023, 02:46 PM UTC

    A fix has been implemented and we are monitoring the results.

  5. monitoring May 03, 2023, 02:46 PM UTC

    We are continuing to monitor for any further issues.

  6. monitoring May 03, 2023, 02:55 PM UTC

    We are continuing to monitor for any further issues.

  7. resolved May 03, 2023, 02:57 PM UTC

    This incident has been resolved.

  8. postmortem May 03, 2023, 05:30 PM UTC

    **What products were affected and what was the impact?**

    Dashboard Quoter

    Impact: CRITICAL

    **What timeframe did this issue occur?**

    | | **Date** | **Time** |
    | --- | --- | --- |
    | From: | May 2nd, 2023 | 15:30 MST |
    | To: | May 3rd, 2023 | 08:35 MST |

    **How was the issue detected?**

    A developer was using [dashboard.zonos.com](http://dashboard.zonos.com/) to make a quote and discovered it was broken.

    **What functionality was affected?**

    100% of Quoter requests failed.

    **What problems did this cause?**

    Customers using Quoter were unable to get quotes.

    **What was the resolution of the problem and steps that are being taken for continued follow-up?**

    It was discovered that a deployment of Zonos Dashboard had a missing environment variable. The missing environment variable was added to the deployment, and Quoter functionality was restored. This was validated both via server logs and manual testing.

    **What mitigation solutions will we put in place to prevent this issue from occurring in the future?**

    * We have improved our testing procedures to catch similar deployment issues in QA before the deployment reaches production.
    * We will no longer allow a build if an environment variable is missing.
    * We have modified our release schedule to allow for greater support coverage around releases.
    * We are working to improve alerting for issues related to Dashboard functionality and deployments.
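
The mitigation of blocking a build when an environment variable is missing could look roughly like the sketch below. It is illustrative only and is not Zonos's actual tooling: the postmortem does not name the missing variable or the build system, so the variable names `QUOTER_API_URL` and `QUOTER_API_KEY` and the helpers `find_missing` and `main` are hypothetical.

```python
import os
import sys

# Hypothetical names: the postmortem does not say which variable was missing.
REQUIRED_ENV_VARS = ("QUOTER_API_URL", "QUOTER_API_KEY")


def find_missing(required, environ):
    """Return the required names that are unset or blank in environ."""
    return [name for name in required if not environ.get(name, "").strip()]


def main(environ=None):
    """Return an exit code: 0 if every required variable is set, 1 otherwise."""
    environ = os.environ if environ is None else environ
    missing = find_missing(REQUIRED_ENV_VARS, environ)
    if missing:
        # Report every missing name at once so one failed build is enough to fix them all.
        print(
            "Build blocked, missing environment variables: " + ", ".join(missing),
            file=sys.stderr,
        )
        return 1
    return 0
```

A build or deploy step could call `sys.exit(main())` before anything is published, so that a non-zero exit code stops the pipeline. Treating blank values as missing is a design choice made here, on the assumption that an empty variable would break the application the same way an absent one does.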