HUIT incident

Regulated Data (ReD) Environment: GPU Partition Issues

Notice Resolved

HUIT experienced a notice-level incident on May 15, 2026, affecting Other Services and lasting 4d 19h. The incident has been resolved; the full update timeline is below.

Started
May 15, 2026, 07:37 PM UTC
Resolved
May 20, 2026, 03:07 PM UTC
Duration
4d 19h
Detected by Pingoru
May 15, 2026, 07:37 PM UTC
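
The reported duration can be cross-checked against the Started and Resolved timestamps. The following is a minimal sketch in Python, assuming the timestamp format shown on this page; the helper name `parse_utc` and the format string are illustrative and not part of any HUIT or Pingoru tooling. The exact difference works out to 4d 19h 30m, which the page appears to display truncated to whole hours.

```python
from datetime import datetime, timezone

# Assumed display format of the timestamps on this page, minus the " UTC" suffix.
FMT = "%b %d, %Y, %I:%M %p"

def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp such as 'May 15, 2026, 07:37 PM UTC' (hypothetical helper)."""
    naive = datetime.strptime(stamp.replace(" UTC", ""), FMT)
    return naive.replace(tzinfo=timezone.utc)

started = parse_utc("May 15, 2026, 07:37 PM UTC")
resolved = parse_utc("May 20, 2026, 03:07 PM UTC")

elapsed = resolved - started
days = elapsed.days
hours, remainder = divmod(elapsed.seconds, 3600)
minutes = remainder // 60

print(f"{days}d {hours}h {minutes}m")  # prints "4d 19h 30m"
```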

Affected components

Other Services

Update timeline

  1. investigating May 15, 2026, 07:37 PM UTC

    The Regulated Data (ReD) Environment is currently experiencing issues with provisioning and dispatching GPU jobs. AWS has confirmed limited GPU availability for the instance types used by the main GPU partitions. The Univ RCD Services team is investigating options to increase availability and restore normal GPU job processing. The team will continue to monitor and provide updates.

  2. resolved May 20, 2026, 03:07 PM UTC

    Temporary adjustments have restored GPU access, and all resources have been confirmed reachable.