CloudAMQP Outage History

CloudAMQP partial outage · 1 active incident View live status →

There were 13 CloudAMQP outages since February 3, 2026 totaling 57h 4m of downtime. Each is summarised below — incident details, duration, and resolution information.

Source: https://status.cloudamqp.com

Major April 28, 2026

Backend slow/unresponsible

Detected by Pingoru
Apr 28, 2026, 10:27 AM UTC
Resolved
Apr 28, 2026, 11:05 AM UTC
Duration
37m
Affected: Backend
Timeline · 4 updates
  1. investigating Apr 28, 2026, 10:27 AM UTC

    We are currently experiencing issues with our backend services handling account and server creation. This does not affect any running customer servers but might delay provisioning new servers.

  2. identified Apr 28, 2026, 10:49 AM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Apr 28, 2026, 10:53 AM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Apr 28, 2026, 11:05 AM UTC

    We had an abnormal increase in traffic causing timeouts and slow responses for customer.cloudamqp.com, we increased capacity and are back to normal. This did not affect any running clusters, just provisioning and configuration updates.

Read the full incident report →

Major April 24, 2026

Connection issues in Azure region westus

Detected by Pingoru
Apr 24, 2026, 09:36 AM UTC
Resolved
Apr 24, 2026, 10:05 AM UTC
Duration
28m
Affected: Microsoft Azure
Timeline · 2 updates
  1. investigating Apr 24, 2026, 09:36 AM UTC

    We are getting indications of instances being down in Azure region westus. We are investigating the issue.

  2. resolved Apr 24, 2026, 11:12 AM UTC

    This incident has been resolved.

Read the full incident report →

Minor April 23, 2026

Issues with Azure West US 3 region

Detected by Pingoru
Apr 23, 2026, 10:10 AM UTC
Resolved
Apr 23, 2026, 10:56 AM UTC
Duration
45m
Affected: Microsoft Azure
Timeline · 2 updates
  1. investigating Apr 23, 2026, 10:10 AM UTC

    Servers in Azure West US 3 can experience connection issues due to underlying issues at provider. We monitor and will update when we know more.

  2. resolved Apr 23, 2026, 10:56 AM UTC

    This incident has been resolved.

Read the full incident report →

Notice April 23, 2026

Metrics integration issues

Detected by Pingoru
Apr 23, 2026, 03:30 AM UTC
Resolved
Apr 23, 2026, 06:00 AM UTC
Duration
2h 29m
Affected: Metrics
Timeline · 4 updates
  1. investigating Apr 23, 2026, 04:33 AM UTC

    There appears to be an issue regarding sending metrics to datadog. We are currently investigating the issue

  2. monitoring Apr 23, 2026, 06:53 AM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Apr 23, 2026, 11:32 AM UTC

    This incident has been resolved.

  4. postmortem Apr 24, 2026, 08:38 AM UTC

    Broker metrics and metrics integrations degradation Window: 2026-04-23 03:25 – 05:35 UTC \(2h 10m\) ## Impact Broker metrics \(connections, channels, queues, consumers, message rates, node and netsplit state\) were degraded, progressively dropping until recovery at 05:35 UTC. Downstream effects: * Queue, Consumer, Connection, Channel, Connection flow, and Netsplit alarms: threshold breaches during the window may not have triggered, or triggered late. Alarms already tripped before 03:25 UTC kept their state. * Metrics Integrations \(Datadog, CloudWatch, New Relic, Splunk, Dynatrace, etc.\): delivery of broker metrics was reduced for the duration of the window. Alarms configured on the receiving side may likewise have missed or lagged. Not affected: server metrics \(CPU, memory, disk\) and their corresponding alarms, Notice alarms, broker availability, and the metric graphs shown in the CloudAMQP console \(these use a separate data path and continued to update normally\). ## Timeline \(UTC\) * 03:25 — broker-metrics collection begins degrading * 03:35 — ~30% of broker-metric samples missing * 03:50 — ~50% plateau * 05:25 — collector workers cycled and re-initialised * 05:35 — full throughput restored ## Root cause Broker metrics are polled from each cluster's management HTTP API by a pool of collector workers and published to an internal message bus that both the Alarms service and Metrics Integrations consume from. At 03:25 UTC this service became unresponsive without raising an error. Affected workers silently stopped polling while remaining alive to the platform, so automatic restart did not trigger. On-call staff began investigating around 05:00 UTC; force-restarting the service restored metric flow. The exact trigger has not been identified. Our focus is on ensuring the condition is detected and handled promptly if it recurs. ## What we are changing * Added metrics and alarms specific to this service, with high-urgency paging for on-call. * Reliability improvements in the service itself to detect and restart stalled work. ## What you can do CloudAMQP offers a new generation of metrics integrations based on Prometheus, with a re-engineered pipeline that does not rely on these centralised services — each server forwards data directly to your endpoint. These have been running in production for some time and have proved very reliable. * [https://www.cloudamqp.com/blog/prometheus-metrics-integrations.html](https://www.cloudamqp.com/blog/prometheus-metrics-integrations.html) * [https://www.cloudamqp.com/docs/monitoring\_metrics\_datadog\_v3.html](https://www.cloudamqp.com/docs/monitoring_metrics_datadog_v3.html) * Background: [https://www.cloudamqp.com/blog/decentralized-observability-with-open-telemetry-part-1.html](https://www.cloudamqp.com/blog/decentralized-observability-with-open-telemetry-part-1.html)

Read the full incident report →

Notice April 15, 2026

Monitoring Systems Short Outage and "Connection issues" Alarms

Detected by Pingoru
Apr 15, 2026, 03:32 PM UTC
Resolved
Apr 15, 2026, 02:00 PM UTC
Duration
Timeline · 1 update
  1. resolved Apr 15, 2026, 03:32 PM UTC

    One of our backend components suffered a small outage due to an unexpected full services restart. Connectivity with the servers were broken and service triggered alarms for many clusters. We apologize for the inconvenience. All clusters were operational and healthy within the incident/alarms timespan.

Read the full incident report →

Minor April 2, 2026

Management Interface Unavailable - 502 Bad Gateway Response

Detected by Pingoru
Apr 02, 2026, 02:41 PM UTC
Resolved
Apr 02, 2026, 03:12 PM UTC
Duration
31m
Affected: Shared serversDedicated servers
Timeline · 4 updates
  1. identified Apr 02, 2026, 02:41 PM UTC

    Customers are experiencing Management UI being unavailable with 502 Bad Gateway. Timeline: - 14:16 UTC -> The issue was identified - 14:30 UTC -> Recent code changes were reverted. No relationship with incident noted.

  2. identified Apr 02, 2026, 02:52 PM UTC

    The Management UI and brokers are serving traffic normally. You can still use the Management UI by using the Broker URL and properly typing User and Password. The impact is related to our backend service logs into Management UI via CloudAMQP SSO.

  3. identified Apr 02, 2026, 03:07 PM UTC

    We are continuing to work on a fix for this issue.

  4. resolved Apr 02, 2026, 03:12 PM UTC

    The Management UI SSO shortcut is back to service. Impacted Clusters: All customers with clusters created more than 6 months ago Cause: Some code changes unexpectedly changed the communication schema between our backend services and the SSO component on the clusters. The proxy configuration conflicted with those changes causing the 502 Bad Gateway error to show up. Impact: Only the SSO via CloudAMQP Console was impacted given it required communication with our backend components. The AMQP Clusters were healthy and servicing traffic via both AMQP and HTTP on Management UI and API. Logging into the Management UI via Credentials and all Authentications Backends was functioning normally.

Read the full incident report →

Notice March 25, 2026

Metrics delivery delay for 12% of the fleet

Detected by Pingoru
Mar 25, 2026, 05:45 PM UTC
Resolved
Mar 25, 2026, 06:03 PM UTC
Duration
17m
Affected: Metrics
Timeline · 2 updates
  1. identified Mar 25, 2026, 05:45 PM UTC

    We're currently experiencing issues with metrics delivery, this affects 12% of the API based v1/v2 integrations. We've added additional capacity and expect have handled the backlog within 20 minutes.

  2. resolved Mar 25, 2026, 06:03 PM UTC

    This incident has been resolved.

Read the full incident report →

Minor March 18, 2026

Queue metrics for legacy integrations delayed

Detected by Pingoru
Mar 18, 2026, 09:25 PM UTC
Resolved
Mar 19, 2026, 03:50 AM UTC
Duration
6h 24m
Affected: Metrics
Timeline · 3 updates
  1. identified Mar 18, 2026, 09:25 PM UTC

    We noticed an issue with our queue metrics being sent to metrics integrations. there is currently a delay in processing, but we have identified an issue and expect the delay to be fixed shortly.

  2. monitoring Mar 18, 2026, 10:16 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Mar 19, 2026, 03:50 AM UTC

    This incident has been resolved.

Read the full incident report →

Notice March 3, 2026

Coralogix log delivery performance issues

Detected by Pingoru
Mar 03, 2026, 09:11 PM UTC
Resolved
Mar 04, 2026, 02:14 PM UTC
Duration
17h 3m
Timeline · 3 updates
  1. investigating Mar 03, 2026, 09:11 PM UTC

    We're currently experiencing issues with Coralogix log delivery.

  2. monitoring Mar 04, 2026, 11:16 AM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Mar 04, 2026, 02:14 PM UTC

    This incident has been resolved.

Read the full incident report →

Notice March 3, 2026

Metrics delivery outage for v3 Prometheus based metrics.

Detected by Pingoru
Mar 03, 2026, 06:28 PM UTC
Resolved
Mar 03, 2026, 08:51 PM UTC
Duration
2h 23m
Affected: Metrics
Timeline · 5 updates
  1. investigating Mar 03, 2026, 06:28 PM UTC

    We're currently experiencing issues with metrics delivery, this affects v3 (Prometheus based) metrics.

  2. identified Mar 03, 2026, 06:47 PM UTC

    The issue has been identified and a fix is being implemented.

  3. monitoring Mar 03, 2026, 07:30 PM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved Mar 03, 2026, 08:51 PM UTC

    This incident has been resolved.

  5. postmortem Mar 04, 2026, 02:14 PM UTC

    ### Summary On March 3, an internal monitoring metric stopped being reported for approximately 14 hours. No customer metrics delivery was affected, all integrations continued operating normally. The missing metric caused our dashboards to display false failure rates, which was incorrectly reported as a metrics delivery outage. ### Impact None. Metrics continued to be delivered to all customer endpoints throughout the incident. ### Resolution The monitoring metric was restored and we have updated our alerting queries to be resilient to missing data, preventing false positives in the future.

Read the full incident report →

Notice February 17, 2026

Issues with shared server "chameleon"

Detected by Pingoru
Feb 17, 2026, 07:31 PM UTC
Resolved
Feb 18, 2026, 03:30 PM UTC
Duration
19h 59m
Timeline · 3 updates
  1. investigating Feb 17, 2026, 07:31 PM UTC

    We are currently investigating this issue.

  2. monitoring Feb 17, 2026, 09:44 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Feb 18, 2026, 03:30 PM UTC

    This incident has been resolved.

Read the full incident report →

Major February 3, 2026

Azure outage

Detected by Pingoru
Feb 03, 2026, 02:53 AM UTC
Resolved
Feb 03, 2026, 08:17 AM UTC
Duration
5h 23m
Affected: Microsoft Azure
Timeline · 2 updates
  1. monitoring Feb 03, 2026, 02:53 AM UTC

    A known Azure outage is leading to an error that prevents instances being created for now.

  2. resolved Feb 03, 2026, 08:17 AM UTC

    This incident has been resolved by Azure, https://azure.status.microsoft/en-us/status

Read the full incident report →

Looking to track CloudAMQP downtime and outages?

Pingoru polls CloudAMQP's status page every 5 minutes and alerts you the moment it reports an issue — before your customers do.

  • Real-time alerts when CloudAMQP reports an incident
  • Email, Slack, Discord, Microsoft Teams, and webhook notifications
  • Track CloudAMQP alongside 5,000+ providers in one dashboard
  • Component-level filtering
  • Notification groups + maintenance calendar
Start monitoring CloudAMQP for free

5 free monitors · No credit card required