Kustomer incident
[Satisfaction Surveys] CSATs may not send for voice, email and sms channels - Prod 1
Kustomer experienced a minor incident on January 27, 2026 affecting CSAT, lasting 1d 3h. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Jan 27, 2026, 07:16 PM UTC
Kustomer is aware of an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team is currently working to identify the cause of this issue in an effort to implement a resolution. Please expect additional updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- investigating Jan 27, 2026, 07:27 PM UTC
Kustomer is aware of an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team is currently working to identify the cause of this issue in an effort to implement a resolution. Please expect additional updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- investigating Jan 27, 2026, 07:31 PM UTC
Kustomer is aware of an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team is currently working to identify the cause of this issue in an effort to implement a resolution. Please expect additional updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- investigating Jan 27, 2026, 07:58 PM UTC
Kustomer is aware of an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team is continuing to work to identify the cause of this issue in an effort to implement a resolution. Please expect additional updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- investigating Jan 27, 2026, 08:29 PM UTC
Kustomer is aware of an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team is continuing to investigate the cause of this issue in an effort to implement a resolution. Please expect additional updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- investigating Jan 27, 2026, 09:03 PM UTC
Kustomer is aware of an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team is continuing to work to investigate the cause of this issue in an effort to implement a resolution. Please expect additional updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- identified Jan 27, 2026, 09:32 PM UTC
Kustomer has identified an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team is currently working to implement a resolution. Please expect further updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- identified Jan 27, 2026, 09:51 PM UTC
Kustomer has identified an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team is continuing to work to implement a resolution. Please expect further updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- identified Jan 27, 2026, 10:23 PM UTC
Kustomer has identified an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team continues to work to implement a resolution. Please expect further updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- identified Jan 27, 2026, 11:01 PM UTC
Kustomer has identified an event affecting Satisfaction surveys that may cause surveys to not be sent, once conversations are marked done within the platform. Our team is working to implement a resolution. Please expect further updates within the next 30 minutes, and reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- monitoring Jan 28, 2026, 12:16 AM UTC
The team has identified the problem and mitigations have been applied. Jobs are gradually catching up and the team continues to monitor.
- investigating Jan 28, 2026, 10:05 PM UTC
Kustomer has resolved an event affecting Satisfaction Surveys that may cause Surveys to not be sent. To resolve this issue, our team has released an update. After careful monitoring, our team has determined that our systems are now fully restored, but our engineering team is still redriving surveys that did not originally send. During this redriving period lingering issues may still be present. Please reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- resolved Jan 28, 2026, 11:10 PM UTC
Kustomer has resolved an event affecting Satisfaction Surveys that may cause Surveys to not be sent. To resolve this issue, our team has released an update. After careful monitoring, our team has determined that our systems are now fully restored, but our engineering team is still redriving surveys that did not originally send. During this redriving period lingering issues may still be present. Please reach out to Kustomer support at [email protected] if you have additional questions or concerns.
- postmortem Feb 04, 2026, 09:50 PM UTC
# Post Mortem: Chat, Voice Routing, CSAT, OAuth, Scheduled Send Issues

# **Summary**

On January 22, 2026, customers experienced chat and voice conversations failing to route to agents due to a recent change in the assistant service. This triggered cascading failures across multiple backend services, degrading platform performance for several orgs.

On January 27, a subsequent incident occurred because our scheduled jobs queue was flooded by assistant service jobs. This caused some CSAT surveys to not send, OAuth connections to fail to refresh, and scheduled messages to not send.

**Root Cause**

A recent change to the assistant backend service increased the maximum number of workflow loops allowed before transferring a conversation to an available agent. This allowed a single rate-limited WhatsApp conversation to become stuck in an infinite retry loop while attempting to transfer. The transfer requests themselves were also rate-limited, and the resulting retries overwhelmed shared infrastructure (e.g., the “job” engine, which is shared by CSAT), causing service degradation across multiple orgs.
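The failure mode described in the root cause (a rate-limited transfer retried without an effective upper bound) can be illustrated with a minimal sketch. This is not Kustomer's code: `transfer_with_cap`, `RateLimited`, and `TransferResult` are hypothetical names, and the cap and delay values are assumptions chosen only for illustration. The sketch shows a hard cap on attempts combined with exponential backoff, so a throttled conversation gives up instead of looping indefinitely.

```python
import time
from dataclasses import dataclass


class RateLimited(Exception):
    """Hypothetical error raised when a transfer request is throttled."""


@dataclass
class TransferResult:
    transferred: bool
    attempts: int


def transfer_with_cap(transfer, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call `transfer` until it succeeds or `max_attempts` is reached.

    Waits base_delay * 2**(attempt - 1) seconds between attempts,
    capped at max_delay, and never retries past the hard cap.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            transfer()
            return TransferResult(transferred=True, attempts=attempt)
        except RateLimited:
            if attempt == max_attempts:
                break  # give up instead of re-enqueueing forever
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
    return TransferResult(transferred=False, attempts=max_attempts)
```

A caller that receives `transferred=False` could hand the conversation to a fallback path (for example, a dead-letter queue) rather than scheduling another job on shared infrastructure; that fallback is likewise an assumption, not something the postmortem describes.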
# **Timeline**

**Jan 22, 2026**

* 1:33 PM - Users began experiencing errors with chat and voice conversations failing to route to agents
* 1:44 PM - Engineers identified the problematic deployment and initiated rollback across all environments
* 1:54 PM - Rollback completed across all environments; on-call engineers continued monitoring system status
* 2:02 PM - Full assistant functionality restored for all customers
* 6:07 PM - Customers began to report OAuth connections that failed to refresh

**Jan 24, 2026**

* 6:24 PM - Customers began to report that scheduled messages were not sending

**Jan 27, 2026**

* 1:00 PM - Customers began to report that CSAT surveys were not being sent
* 4:00 PM - Identified cause of CSAT scheduled job processing delays
* 6:00 PM - Started a script to manually increase processing throughput of scheduled jobs

**Jan 28, 2026**

* 12:08 PM - Deployed code change to programmatically increase processing throughput of scheduled jobs
* 3:49 PM - Backlog of all delayed jobs processed and system restored

**Lessons/Improvements**

* Implementing new alerts to detect when scheduled job processing falls behind, enabling faster identification of similar issues
* Improving alert prioritization to reduce noise and ensure critical alerts are acted upon immediately
* Enhancing monitoring for downstream service dependencies
* Evaluating queue architecture changes to prevent a single conversation from impacting other customers ("noisy neighbor" isolation)
* Investigating improvements to make our job scheduling service more resilient to backlogs
* Creating documentation of all services that depend on scheduled jobs to better understand incident ripple effects
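
Two of the improvements above, alerting when scheduled job processing falls behind and "noisy neighbor" isolation, can be sketched together. This is a hedged illustration only: `FairJobQueue`, its methods, and the 300-second threshold are hypothetical and do not reflect Kustomer's actual job engine. The sketch keeps one queue per org and serves orgs in round-robin order, so a flood of jobs from one org delays that org rather than everyone, and it exposes a lag figure that an alert could compare against a threshold.

```python
from collections import OrderedDict, deque


class FairJobQueue:
    """Per-org queues served round-robin (hypothetical sketch)."""

    def __init__(self):
        # org_id -> deque of (due_at, job); dict order is the rotation order
        self._queues = OrderedDict()

    def enqueue(self, org_id, job, due_at):
        self._queues.setdefault(org_id, deque()).append((due_at, job))

    def dequeue(self):
        """Pop one job from the org at the front, then rotate it to the back."""
        if not self._queues:
            return None
        org_id, queue = next(iter(self._queues.items()))
        _, job = queue.popleft()
        del self._queues[org_id]
        if queue:
            self._queues[org_id] = queue  # re-inserted at the back
        return org_id, job

    def max_lag_seconds(self, now):
        """Seconds by which the most overdue pending job is late (0 if none)."""
        due_times = [due for q in self._queues.values() for due, _ in q]
        if not due_times:
            return 0.0
        return max(0.0, now - min(due_times))


def should_alert(lag_seconds, threshold_seconds=300.0):
    """Assumed alert rule: fire once lag exceeds the threshold."""
    return lag_seconds > threshold_seconds
```

With this layout, an org that enqueues thousands of jobs still yields the next turn to every other org with pending work. The lag check scans all pending jobs, which is acceptable for a sketch but would need an indexed structure at production scale.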