DataRobot incident

Batch Jobs queued up in US MTS

DataRobot is currently experiencing a minor incident affecting Website and API and 1 more component, which began 2d ago. The vendor's full update timeline is below.

Started
May 29, 2026, 08:37 AM UTC
Resolved
Ongoing
Duration
1d 17h
Detected by Pingoru
May 29, 2026, 08:37 AM UTC

Affected components

Website, API, Predictions, AutoML, AI Catalog and Data Ingest, AI Apps, MLOps, Pipeline, Notebooks, Generative AI LLM Playground

Update timeline

  1. identified May 29, 2026, 08:37 AM UTC

    Batch jobs currently in the queue are completing successfully; however, their completion status is not being updated correctly. As a temporary workaround, we are manually marking these jobs as completed while we work on a solution.

  2. identified May 29, 2026, 10:22 AM UTC

    The affected jobs have been resolved and the issue is mitigated. Engineering is actively working on a permanent fix, and testing is currently underway.

  3. monitoring May 29, 2026, 02:38 PM UTC

    The remediation script has been deployed and Engineering is actively monitoring the situation. Batch jobs may take longer than usual to show as completed until a permanent fix is rolled out in the next production deployment.

  4. monitoring May 29, 2026, 03:20 PM UTC

    We are continuing to monitor for any further issues.