Voyado incident

[Engage] System Performance Degradation

Voyado experienced a minor incident on September 8, 2025, affecting the API, Web Application, Messaging, Automations, and 3rd Party Integrations components and lasting 47 minutes. The incident has been resolved; the full update timeline is below.

Started: Sep 08, 2025, 10:27 AM UTC
Resolved: Sep 08, 2025, 11:15 AM UTC
Duration: 47m
Detected by Pingoru: Sep 08, 2025, 10:27 AM UTC

Affected components

API, Web Application, Messaging, Automations, 3rd Party Integrations

Update timeline

investigating Sep 08, 2025, 10:27 AM UTC

We are currently investigating indications of general slowness and degraded performance. Users may experience unusually long loading times and slow or unresponsive APIs. This may also affect processing times in automations and sendouts.
investigating Sep 08, 2025, 10:34 AM UTC

We are continuing to investigate the issue with top priority. We can see that many areas of the platform are currently affected by the disturbance, from slow/unresponsive APIs to delays in automations and message delivery.
identified Sep 08, 2025, 10:45 AM UTC

We have identified the issue and taken remedial action. We are not yet back to normal operations, but we see indications that the delays are decreasing.
resolved Sep 08, 2025, 11:15 AM UTC

The remedial actions had the desired effect, and we are now back to normal operations.
postmortem Sep 25, 2025, 11:51 AM UTC

## Summary

On September 8th 2025, our platform experienced degraded performance due to an unexpected surge in traffic. This led to delays in processing and temporary errors across some services.

## Customer Impact

During the incident, some customers encountered error responses from APIs and delays in automations, scheduled jobs, and outbound communications. All delayed messages were eventually processed and delivered once the issue was resolved.

## Root Cause and Mitigation

The disruption was caused by an unexpected increase in usage patterns that placed a heavy load on the underlying systems. This created bottlenecks in message processing, resulting in delays and errors. The issue was resolved by redistributing system load, which restored normal operation.

## Next Steps

We are reviewing system capacity and safeguards to ensure more resilience against sudden changes in usage. Improvements are planned in monitoring, load management, and isolation to reduce the likelihood of similar incidents in the future.