Seven Senders incident

Tracking API is down

Critical · Resolved

Seven Senders experienced a critical incident on July 25, 2025, affecting the Tracking and Shop SQS components and lasting 1d 12h. The incident has been resolved; the full update timeline is below.

Started
Jul 25, 2025, 11:02 AM UTC
Resolved
Jul 26, 2025, 11:23 PM UTC
Duration
1d 12h
Detected by Pingoru
Jul 25, 2025, 11:02 AM UTC

Affected components

Tracking, Shop SQS

Update timeline

  1. investigating Jul 25, 2025, 11:02 AM UTC

    We are currently investigating this issue.

  2. investigating Jul 25, 2025, 02:21 PM UTC

    We have a major outage in one of our services, which affects shipment tracking, including the Tracking Page, Tracking API, Analytics, and Parcel Finder. We have identified the cause and are working to restore the application.

  3. investigating Jul 26, 2025, 04:49 PM UTC

    A fix has been implemented, and we are currently monitoring the applications.

  4. monitoring Jul 26, 2025, 04:49 PM UTC

    A fix has been implemented and we are monitoring the results.

  5. resolved Jul 26, 2025, 11:23 PM UTC

    This incident has been resolved.

  6. postmortem Jul 29, 2025, 12:11 PM UTC

    # Incident Statement: Tracking API Service Disruption

    **Date:** July 25th through 26th
    **Duration:** 32 hours + 6 hours cleanup
    **Impact:** Tracking API and Tracking Pages request timeouts due to database performance issues

    ## What Happened

    On July 25th, our tracking subsystem experienced a service disruption due to the unintentional deletion of a critical unique index on our main database. This index is essential for processing tracking-related queries.

    ## Impact on Services

    The loss of this index caused database queries to take much longer than normal, resulting in requests failing with timeouts.

    ## Affected Services

    * Tracking API
    * Tracking Pages
    * Parcel Finder

    ## Resolution Steps Taken

    Our engineering team responded immediately to restore service (a rough sketch of steps 4 through 6 follows this statement):

    1. **Initial Response:** Initiated rebuilding the deleted unique index
    2. **Load Management:** Stopped unindexed queries
    3. **Infrastructure Scaling:** Scaled up the database cluster to speed up indexing and reduce database load
    4. **Interim Solution:** Created a non-unique index so queries could run after the unique-index rebuild failed due to duplicate entries created while the index was missing
    5. **Data Cleanup:** Cleaned up the duplicate entries
    6. **Full Recovery:** Successfully recreated the unique index, restoring normal service

    ## Preventive Measures

    To prevent similar incidents in the future, we are implementing the following measures:

    * Enhanced database change management procedures with mandatory peer review for all schema modifications
    * Automated database integrity checks to detect missing or corrupted indexes before production deployment
    * Enhanced code review processes focused specifically on database schema changes and unindexed queries
    * Improved standard operating procedures for common database incident scenarios

    ## Our Commitment

    We sincerely apologize for the disruption this incident caused. We take full responsibility for this issue and have conducted a thorough post-incident review. We are committed to implementing the preventive measures outlined above to ensure the reliability and stability of our services.

    If you have any questions or concerns regarding this incident, please don't hesitate to contact our support team. Thank you for your patience.
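
The sketch below illustrates the recovery pattern from steps 4 through 6 of the postmortem: duplicates inserted while a unique index is missing block its re-creation, so an interim non-unique index keeps lookups fast, the duplicates are removed, and the unique constraint is then restored. It uses SQLite via Python's standard library purely for illustration; the table and column names (`shipments`, `tracking_ref`) are hypothetical and do not reflect the production schema or database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute(
    "CREATE TABLE shipments (id INTEGER PRIMARY KEY, tracking_ref TEXT, status TEXT)"
)
# Duplicate tracking_ref rows, as could appear while the unique index is absent.
cur.executemany(
    "INSERT INTO shipments (tracking_ref, status) VALUES (?, ?)",
    [("SS-001", "in_transit"), ("SS-001", "delivered"), ("SS-002", "in_transit")],
)

# Interim solution: a non-unique index keeps lookups fast while cleanup runs.
cur.execute("CREATE INDEX idx_shipments_tracking_ref ON shipments (tracking_ref)")

# Data cleanup: remove duplicates, keeping the most recently inserted row per ref.
cur.execute(
    """
    DELETE FROM shipments
    WHERE id NOT IN (SELECT MAX(id) FROM shipments GROUP BY tracking_ref)
    """
)

# Full recovery: with duplicates gone, the unique index can be restored.
cur.execute("DROP INDEX idx_shipments_tracking_ref")
cur.execute(
    "CREATE UNIQUE INDEX uidx_shipments_tracking_ref ON shipments (tracking_ref)"
)
conn.commit()

print(cur.execute("SELECT tracking_ref, status FROM shipments").fetchall())
```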