DataRobot Outage History

DataRobot is up right now

DataRobot had 32 outages in the last 2 years totaling 82h 16m of downtime, averaging 1.3 incidents per month.

There have been 32 DataRobot outages since June 3, 2025, totaling 82h 16m of downtime. Each is summarised below with incident details, duration, and resolution information.

Source: https://status.datarobot.com
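The per-incident durations and the monthly average quoted above can be re-derived from the listed timestamps. The following is a minimal sketch, assuming the timestamp layout used in the entries on this page (for example "Sep 02, 2025, 03:37 PM UTC") and an English locale for month names. The function names are hypothetical and are not part of any DataRobot or Pingoru API. Because the listed timestamps are truncated to the minute, a result may differ from a listed duration by a minute.

```python
from datetime import datetime

# Timestamp layout assumed from the entries on this page,
# e.g. "Sep 02, 2025, 03:37 PM UTC".
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_stamp(text: str) -> datetime:
    """Parse one status-page timestamp into a naive datetime (UTC implied)."""
    return datetime.strptime(text, STAMP_FORMAT)


def format_duration(start: str, end: str) -> str:
    """Render the gap between two timestamps in the page's 'Xd Yh' / 'Xh Ym' / 'Xm' style."""
    delta = parse_stamp(end) - parse_stamp(start)
    total_minutes = int(delta.total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    if days:
        return f"{days}d {hours}h"
    if hours:
        return f"{hours}h {minutes}m"
    return f"{minutes}m"


def incidents_per_month(count: int, months: int) -> float:
    """Average number of incidents per month, rounded to one decimal place."""
    return round(count / months, 1)


if __name__ == "__main__":
    print(format_duration("Sep 02, 2025, 03:37 PM UTC", "Sep 02, 2025, 06:07 PM UTC"))
    print(incidents_per_month(32, 24))
```

Applied to the September 2, 2025 entry, the sketch reproduces the listed 2h 30m, and 32 incidents over 24 months rounds to the quoted 1.3 per month.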

Major September 9, 2025

Users in the APAC region are experiencing network issues when accessing application endpoints in US MTS

Detected by Pingoru: Sep 09, 2025, 05:39 AM UTC
Resolved: Sep 10, 2025, 06:03 PM UTC
Duration: 1d 12h
Affected: Website
Timeline · 5 updates
  1. investigating Sep 09, 2025, 05:39 AM UTC

    We've identified that some users in Japan are experiencing an intermittent login issue when trying to access the US MTS. The engineering team is currently investigating.

  2. investigating Sep 09, 2025, 11:26 AM UTC

    Users in APAC are experiencing network issues when accessing application endpoints in US MTS. Engineering is investigating potential ISP-related networking issues in the APAC region. Affected customers are encouraged to try a different network and/or browser to mitigate the issue.

  3. monitoring Sep 09, 2025, 02:02 PM UTC

    Users can now access the US MTS cluster from the APAC region, and the issue has recovered. The engineering team will continue monitoring this issue.

  4. monitoring Sep 09, 2025, 06:14 PM UTC

    We are continuing to monitor for any further issues.

  5. resolved Sep 10, 2025, 06:03 PM UTC

    This incident has been resolved.

Major September 2, 2025

Serverless Prediction Servers are impacted when a new deployment is created or an existing deployment is modified.

Detected by Pingoru: Sep 02, 2025, 03:37 PM UTC
Resolved: Sep 02, 2025, 06:07 PM UTC
Duration: 2h 30m
Affected: Prediction, MLOps
Timeline · 2 updates
  1. investigating Sep 02, 2025, 03:37 PM UTC

    We've identified an issue in the Japan cluster impacting serverless deployments when a new deployment is created or an existing deployment is modified. Predictions against existing, unmodified deployments on serverless prediction servers are not impacted. Engineering is currently working on a fix to address the issue.

  2. resolved Sep 02, 2025, 06:07 PM UTC

    Engineering has applied the fix to mitigate the issue impacting serverless deployments when a new deployment is created or an existing deployment is modified. The problem has now been contained.

Notice July 29, 2025

Some customers are not receiving scheduled deployment reports.

Detected by Pingoru: Jul 29, 2025, 11:51 AM UTC
Resolved: Jul 29, 2025, 10:09 PM UTC
Duration: 10h 17m
Affected: MLOps
Timeline · 3 updates
  1. investigating Jul 29, 2025, 11:51 AM UTC

    Some customers are not receiving scheduled deployment reports. The data can still be observed in the application UI. Engineering is actively working on restoring the functionality. For instructions on generating a report, see the documentation: https://docs.datarobot.com/en/docs/mlops/governance/deploy-reports.html#generate-a-deployment-report

  2. identified Jul 29, 2025, 02:32 PM UTC

    Engineering has identified the root cause and prepared a fix. The team has now started deploying the fix to MTS production.

  3. resolved Jul 29, 2025, 10:09 PM UTC

    This incident has been resolved.

Critical July 7, 2025

Interface is down for all customers.

Detected by Pingoru: Jul 07, 2025, 03:42 PM UTC
Resolved: Jul 07, 2025, 04:20 PM UTC
Duration: 38m
Affected: Website, API, Predictions, AutoML, AI Catalog and Data Ingest, AI Apps, MLOps, Pipeline, Notebooks, Generative AI LLM Playground, Generative AI VDB Builder
Timeline · 3 updates
  1. identified Jul 07, 2025, 03:42 PM UTC

    Starting 11:05 EST today, the interface has been down for all customers in US Prod due to an internal error. Engineering has identified the problem and is applying the fix.

  2. monitoring Jul 07, 2025, 04:12 PM UTC

    The fix has been applied successfully and the interface has been restored. Engineering is currently monitoring the system.

  3. resolved Jul 07, 2025, 04:20 PM UTC

    The fix has resolved the interface failure. The DataRobot app in US Prod is functioning normally.

Notice June 12, 2025

DataRobot Experiencing Stability Issues

Detected by Pingoru: Jun 12, 2025, 08:35 PM UTC
Resolved: Jun 13, 2025, 01:30 AM UTC
Duration: 4h 55m
Timeline · 3 updates
  1. identified Jun 12, 2025, 08:27 PM UTC

    External cloud providers are experiencing outages, which are impacting the DataRobot platform and services.

  2. monitoring Jun 12, 2025, 08:35 PM UTC

    Our engineering team will continue to monitor the outages reported by external cloud providers.

  3. resolved Jun 13, 2025, 02:37 AM UTC

    Our engineering team confirmed that the outage with our external cloud provider has been resolved. We have been monitoring our functionalities and no specific functionality was affected.

Notice June 12, 2025

Managed EU AI Cloud Degraded Performance

Detected by Pingoru: Jun 12, 2025, 12:23 PM UTC
Resolved: Jun 12, 2025, 08:00 AM UTC
Duration
Timeline · 1 update
  1. resolved Jun 12, 2025, 12:23 PM UTC

    Managed EU AI Cloud experienced degraded performance from 7:00 to 7:30 AM UTC. Some users may have experienced intermittent connection interruptions. Please contact DataRobot if you have any questions.

Notice June 3, 2025

Deployment drift and accuracy charts are broken

Detected by Pingoru: Jun 03, 2025, 09:43 AM UTC
Resolved: Jun 04, 2025, 01:13 PM UTC
Duration: 1d 3h
Timeline · 3 updates
  1. investigating Jun 03, 2025, 09:43 AM UTC

    The root cause has been identified and the team is working on the fix.

  2. identified Jun 03, 2025, 01:16 PM UTC

    The team is working on implementing the fix.

  3. resolved Jun 04, 2025, 01:13 PM UTC

    The issue is now resolved and the drift charts are working as expected.