Oso incident

Intermittent HTTP 401 errors in us-east-1

Notice Resolved

Oso experienced a notice-level incident on January 15, 2025, lasting 1h 8m. The incident has been resolved; the full update timeline is below.

Started: Jan 15, 2025, 08:51 PM UTC
Resolved: Jan 15, 2025, 10:00 PM UTC
Duration: 1h 8m
Detected by Pingoru: Jan 15, 2025, 08:51 PM UTC

Update timeline

resolved Jan 15, 2025, 08:51 PM UTC

From 22:03 to 22:26 UTC, requests to tasks deployed on a corrupted container instance incorrectly failed with HTTP 401 status codes. The corrupted instance was terminated at 22:26, at which point service was restored. The instance was corrupted by a race condition triggered by how AWS ECS launches tasks on container instances. We have since made changes to safeguard against this race condition.