Roadie incident

Core Platform Service Degradation

Roadie experienced a minor incident on June 6, 2025 affecting Core Platform, lasting 38m. The incident has been resolved; the full update timeline is below.

Started: Jun 06, 2025, 11:03 PM UTC
Resolved: Jun 06, 2025, 11:41 PM UTC
Duration: 38m
Detected by Pingoru: Jun 06, 2025, 11:03 PM UTC

Affected components

Core Platform

Update timeline

investigating Jun 06, 2025, 11:03 PM UTC

On June 6th, 2025 at approximately 6:45pm EDT, our platform began experiencing a brief service degradation. An unexpected resource constraint on one of our internal servers is affecting gig creation and general performance, although the system remains stable overall. We will keep you informed as we gather more information about the cause and scope of this incident. Roadie Support and Operations teams are available to help with any delivery challenges you experience due to the incident. Roadie appreciates your patience and apologizes for the inconvenience. If you have any questions or concerns, please reach out to your Roadie Account Manager or [email protected].

investigating Jun 06, 2025, 11:25 PM UTC

Roadie engineers are continuing to investigate this issue.

identified Jun 06, 2025, 11:27 PM UTC

Roadie engineers have identified the issue, implemented a fix, and are monitoring the platform closely.

resolved Jun 06, 2025, 11:41 PM UTC

The incident has been resolved and core platform services are restored to normal functioning. We appreciate your patience and apologize for the inconvenience.

postmortem Jun 10, 2025, 04:44 AM UTC

**Service Disruption Summary – June 6th 2025**

On June 6th, 2025, beginning at approximately 6:45pm EDT, our platform experienced a partial service degradation. An unexpected resource constraint on one of our internal servers degraded services, affecting gig creation and overall performance. The underlying issue was identified quickly, but resolution required safely restarting all affected services, which was completed by 7:41pm EDT.

To reduce the likelihood of similar disruptions in the future, we have implemented the following measures:

* Adjusted workload requests/limits to avoid further resource constraints.
* Added alerting on the affected workloads.

We apologize for any disruption this may have caused. If you have any questions, please feel free to reach out to us at [[email protected]](mailto:[email protected]).