Voyado incident

[Engage] Delay in Processing of Sendouts

Voyado experienced a minor incident on September 28, 2025 affecting Messaging, lasting 1h 39m. The incident has been resolved; the full update timeline is below.

Started: Sep 28, 2025, 08:34 AM UTC
Resolved: Sep 28, 2025, 10:14 AM UTC
Duration: 1h 39m
Detected by Pingoru: Sep 28, 2025, 08:34 AM UTC

Affected components

Messaging

Update timeline

investigating Sep 28, 2025, 08:34 AM UTC

We are currently experiencing delays with sendouts affecting a subset of customers. We are investigating and will provide updates as soon as we have more information.

monitoring Sep 28, 2025, 09:13 AM UTC

We identified the issue that was causing message delays for a subset of customers. The affected messages are now being processed, and we will continue to monitor closely until all messages have been sent.

resolved Sep 28, 2025, 10:14 AM UTC

The delayed messages have been processed and we are back to normal operations.

postmortem Oct 13, 2025, 12:56 PM UTC

## Summary

On September 28th, Engage experienced delays in sending bulk email messages. The issue was caused by part of our internal email processing setup not operating as expected. The delay was resolved the same day, and no messages were lost.

## Customer Impact

Bulk emails for a small subset of customers were delayed by up to 3 hours. Priority messages, including automation emails and SMS, were not affected and were delivered as normal.

## Root Cause

Three of the internal services responsible for processing emails were inactive following a system update on September 27th. The update caused them to restart without automatically reactivating the components they need. Combined with high traffic in the system, this led to a backlog in the email messaging pipeline.

## Mitigation

The inactive services were redeployed, restoring full processing capacity. Once this was done, the message queue was quickly processed, and operations returned to normal.

## Next Steps

To prevent similar issues in the future, we are:

* Updating our deployment pipeline to ensure the required components are automatically restarted after updates.
* Introducing additional monitoring and alerts to detect inactive or underperforming internal services sooner.

We appreciate your understanding and patience while we worked through this issue, and we apologize for any inconvenience.