Iron.io incident

Public cluster push queues are delayed but still sending

Minor Resolved

Iron.io experienced a minor incident on January 18, 2017 affecting IronMQ v3 (AWS US-East), lasting 9h 3m. The incident has been resolved; the full update timeline is below.

Started
Jan 18, 2017, 04:03 PM UTC
Resolved
Jan 19, 2017, 01:07 AM UTC
Duration
9h 3m
Detected by Pingoru
Jan 18, 2017, 04:03 PM UTC

Affected components

IronMQ v3 (AWS US-East)

Update timeline

  1. identified Jan 18, 2017, 04:03 PM UTC

    We’re working on resolving the issues as quickly as possible.

  2. identified Jan 18, 2017, 09:29 PM UTC

    We are preparing to deploy fixes to reduce the push queue delays. Please stand by.

  3. resolved Jan 19, 2017, 01:07 AM UTC

    The issue has been resolved, and all messages in push queues have now been flushed and delivered. Note: no messages were lost, but many were significantly delayed in delivery to their endpoints.

    Additionally, we've identified and patched the root causes of the push queue delays seen over the past few weeks. We found edge cases in which users created hundreds of thousands of push queues with no subscribers. The push processor would spend a long time working through those queues with nothing to deliver, which delayed delivery for other customers. We patched the push processor so that it bypasses queues in those cases. This should significantly improve push queue performance.

    We apologize for any application issues this may have caused. We take these issues seriously and are working around the clock to improve the service across the board for users of all paid tiers. That said, if you need guaranteed reliability and performance with SLAs in place, please contact us about our enterprise dedicated cluster offering. To discuss dedicated clusters, or anything else about this incident, you can contact us through our in-app messenger or by emailing [email protected].
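
The bypass described in the resolution note can be illustrated with a minimal sketch. This is not Iron.io's implementation: the `PushQueue` class, the `process_push_queues` function, and the `deliver` callback are hypothetical names invented for illustration, and the real push processor's internals are not described in the update. The sketch only shows the general idea of skipping queues that have no subscribers so that one pass does not spend time on queues with nowhere to deliver.

```python
from dataclasses import dataclass, field


@dataclass
class PushQueue:
    # Hypothetical model of a push queue; not IronMQ's actual data structure.
    name: str
    subscribers: list = field(default_factory=list)
    messages: list = field(default_factory=list)


def process_push_queues(queues, deliver):
    """Run one pass over all queues and return (delivered, skipped) counts."""
    delivered = 0
    skipped = 0
    for queue in queues:
        # Bypass: a queue with no subscribers has nowhere to send messages,
        # so skip it without doing any per-queue work. Its messages stay
        # in the queue and are not dropped.
        if not queue.subscribers:
            skipped += 1
            continue
        # Deliver every pending message to every subscriber endpoint.
        while queue.messages:
            message = queue.messages.pop(0)
            for url in queue.subscribers:
                deliver(url, message)
                delivered += 1
    return delivered, skipped
```

In this sketch a pass over many subscriber-less queues costs one cheap check per queue, and queues that do have subscribers are reached without waiting on the empty ones. Whether the actual patch works this way is an assumption.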