Datto incident

Datto RMM - Zinfandel - Job and Audit Execution Delays

Minor Resolved
Started
Mar 06, 2026, 08:45 PM UTC
Resolved
Mar 06, 2026, 10:04 PM UTC
Duration
1h 19m
Detected by Pingoru
Mar 06, 2026, 08:45 PM UTC

Affected components

Zinfandel (US West)

Update timeline

  1. investigating Mar 06, 2026, 08:45 PM UTC

    We are aware of a problem where Jobs and Audits are experiencing delays on the Zinfandel Platform. The Kaseya R&D Team is investigating the issue. Subscribe to the Kaseya Status Page for up-to-date information at https://status.kaseya.com/

  2. monitoring Mar 06, 2026, 09:04 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Mar 06, 2026, 10:04 PM UTC

    This incident has been resolved.

  4. postmortem Mar 18, 2026, 10:18 AM UTC

    **Summary**

    Around **2026-03-06 9:40 AM EST**, partners on the Zinfandel platform started experiencing delays in both Quick and Scheduled Job execution. The issue was initially mitigated by 12:00 PM EST; however, the steps taken to restore service inadvertently caused the problem to reoccur later that afternoon at approximately 2:13 PM EST. The R&D and Operations teams fully resolved the issue by 2:47 PM EST.

    **Root Cause and Resolution**

    The initial incident was triggered by an unusually large-scale alert resolution operation, which created a significant backlog of processing tasks within the database. The high volume of queued work caused processing times to exceed the allowable execution window. This resulted in repeated retries, which continually saturated the database and prevented other operations from running normally.

    To alleviate the load, the task scheduling service was scaled down, and the queuing services were recycled, which reduced database pressure and restored normal operation by approximately 12:00 PM EST.

    However, at around 2:13 PM EST, these earlier mitigation steps produced an unintended side effect: they constrained the throughput of the service responsible for processing device audits. This limitation caused additional downstream delays in Job execution across the platform. The service was subsequently scaled back up to full capacity, and all services were confirmed healthy by **2:47 PM EST**.

    **Preventative Measures**

    To reduce the likelihood and impact of similar incidents in the future, the following steps are being taken:

    * **Resolution of Related Product Issues:** The R&D team has identified a backend software defect that contributed to the incident. A fix is scheduled for the **14.9 release**.
    * **Enhanced Monitoring, Alerting, and Response:** The Kaseya R&D team is reviewing additional monitoring capabilities to provide deeper insight into application performance at the component level for key services.
    * **Improved Incident Management and Response:** Global Kaseya teams will continue to receive training and coaching on Incident Management playbooks to ensure that all internal stakeholders are promptly informed and take coordinated action when events occur.
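
The retry loop described under Root Cause and Resolution can be illustrated with a toy model. The sketch below is not based on Datto RMM internals: the function `simulate_queue`, its parameters, and all numbers are hypothetical, chosen only to show how a backlog that pushes latency past an execution window can keep a queue from draining.

```python
def simulate_queue(backlog, capacity, window, arrivals, ticks):
    """Toy model of a work queue with an execution window (all values hypothetical).

    backlog  -- tasks already queued at the start
    capacity -- tasks the database can complete per tick
    window   -- maximum latency (in ticks) before an attempt times out
    arrivals -- new tasks added each tick
    ticks    -- number of ticks to simulate

    Returns the queue length at the end of each tick.
    """
    queue = backlog
    history = []
    for _ in range(ticks):
        queue += arrivals
        # Approximate latency: how many ticks it would take to work through the queue.
        latency = queue / capacity
        if latency > window:
            # Attempts exceed the execution window: capacity is spent on work
            # that times out and is retried, so nothing completes this tick.
            completed = 0
        else:
            completed = min(queue, capacity)
        queue -= completed
        history.append(queue)
    return history


if __name__ == "__main__":
    # Moderate backlog: latency stays inside the window and the queue drains.
    print(simulate_queue(backlog=400, capacity=100, window=5, arrivals=10, ticks=6))
    # Large backlog: every attempt times out and the queue only grows.
    print(simulate_queue(backlog=1000, capacity=100, window=5, arrivals=10, ticks=3))
```

In this model the moderate backlog clears within a few ticks, while the large backlog never completes any work because every attempt is retried. Under these assumptions, recovery requires lowering the effective load (or widening the window) before normal processing can resume, which is consistent with the mitigation the postmortem describes.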
