Iron.io incident

IronWorker Service degradation

Iron.io experienced a minor incident on February 2, 2017, affecting IronWorker Dedicated and IronWorker Public and lasting 1h 40m. The incident has been resolved; the full update timeline is below.

Started: Feb 02, 2017, 06:04 PM UTC
Resolved: Feb 02, 2017, 07:44 PM UTC
Duration: 1h 40m
Detected by Pingoru: Feb 02, 2017, 06:04 PM UTC

Affected components

IronWorker Dedicated, IronWorker Public

Update timeline

investigating Feb 02, 2017, 06:04 PM UTC

We have identified a system slowdown with IronWorker. This is a high-priority issue, and our Operations team is actively investigating. We will post an update as soon as information becomes available.

investigating Feb 02, 2017, 06:36 PM UTC

The cause of the slowdowns is a backend database under high load. We're searching for the source of the load and will update this page as soon as it's found. In the meantime, we've restarted the DB and it has mostly recovered, so tasks should start flowing again. Stand by.

monitoring Feb 02, 2017, 06:47 PM UTC

The database has stabilized and we are sweeping projects for queued tasks. You may notice some tasks that ran long. Many of them probably finished but did not exit properly and were not marked as "complete". You can verify this by viewing the task logs. We'll update again as we monitor the system.

monitoring Feb 02, 2017, 07:23 PM UTC

We've discovered the source of the database slowdown and are taking corrective action to build a permanent fix into the system so this does not happen again. Thanks for your patience while we work through slower-than-usual queue times. The system did not lose any jobs; it only saw slower queue times.

resolved Feb 02, 2017, 07:44 PM UTC

The issue is now resolved. We're working on a process to limit the API calls that were putting strain on our databases, and should have something in place soon to prevent this from happening again. If you have any specific questions, please contact [email protected] or your dedicated support channel.
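
The final update mentions limiting the API calls that strained the databases but does not say how. One common way to cap call rates is a token bucket. The Python sketch below is an illustration under assumptions only, not Iron.io's actual fix: the class name TokenBucket, the rate and capacity parameters, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical limiter: `rate` calls per second on average, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start full, allowing an initial burst
        self.last = clock()         # time of the last refill

    def allow(self) -> bool:
        """Return True if a call may proceed now, False if it should be rejected."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0      # spend one token for this call
            return True
        return False
```

In a setup like this, each API client or project would typically get its own bucket, so a single heavy caller is throttled before its requests reach the backend database, while other callers are unaffected.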