Neo4j Aura incident

AWS US-East-1 Region Experiencing Disruption of some Aura DB Pro Operations

Neo4j Aura experienced a minor incident on May 27, 2024 affecting AuraDB Professional on AWS (*.databases.neo4j.io), lasting 17h 11m. The incident has been resolved; the full update timeline is below.

Started: May 27, 2024, 06:10 PM UTC
Resolved: May 28, 2024, 11:21 AM UTC
Duration: 17h 11m
Detected by Pingoru: May 27, 2024, 06:10 PM UTC

Affected components

AuraDB Professional on AWS (*.databases.neo4j.io)

Update timeline

investigating May 27, 2024, 06:10 PM UTC

We have identified an issue with Aura DB Pro in the US East 1 region on AWS that is preventing the following operations: Create, Resize, Clone, Backup, Load. For the time being, please create new instances in other Aura DB Pro regions, and hold off on any of the above operations where possible until this incident is resolved.
investigating May 28, 2024, 01:21 AM UTC

We have identified an issue with Aura DB Pro in the US East 1 region on AWS that is preventing the following operations: Create, Resize, Clone, Backup, Load. For the time being, please create new instances in other Aura DB Pro regions and hold off on any of the above operations until this incident is resolved.
investigating May 28, 2024, 07:19 AM UTC

We have identified an issue with Aura DB Pro in the US East 1 region on AWS that is preventing the following operations: Create, Resize, Clone, Backup, Load. For the time being, please create new instances in other Aura DB Pro regions and hold off on any of the above operations until this incident is resolved.
resolved May 28, 2024, 11:21 AM UTC

We have rolled out a fix and full functionality has been restored.
postmortem May 30, 2024, 10:35 PM UTC

### **What happened**

On 2024-05-27T16:09, the AuraDB Professional tier environment in the us-east-1 region of AWS ran out of network resources (addresses on a subnet). The cause was suboptimal management of this resource, which came to light in this region because of its high volume of instances: a default setting pre-allocated too many addresses and exhausted the pool. Instance operations that need to allocate computing resources (Create, Resize, Clone, Backup, Load) could not start for lack of network access.

### **How the service was affected**

During the incident (between 2024-05-27 at 16:09 and 2024-05-28 at 10:29), customer operations (Create, Resize, Clone, Backup, Load) either did not start or could not complete successfully. Other regions remained unaffected.

### **What we are doing now**

* We have improved network resource management so that network addresses are allocated more efficiently within an environment.
* We are creating an alarm that monitors the network address pools across our environments and detects resource exhaustion.
* We are planning to make better use of zones for AuraDB Professional tiers to benefit from more network resources.
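
To illustrate the failure mode and the planned alarm, here is a minimal Python sketch of how per-node pre-allocation can exhaust a subnet's address pool, and how a utilization threshold could flag it early. It is not Neo4j's implementation: the function `pool_status`, the `PoolStatus` fields, the 80% threshold, and the example CIDR and node counts are all hypothetical. The only external fact assumed is that AWS reserves five addresses in every subnet.

```python
import ipaddress
from dataclasses import dataclass

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


@dataclass
class PoolStatus:
    cidr: str
    usable: int         # addresses available for allocation
    allocated: int      # addresses held by nodes (in use or pre-allocated)
    free: int           # addresses still available to new workloads
    utilization: float  # allocated / usable, between 0.0 and 1.0
    alarm: bool         # True once utilization reaches the threshold


def pool_status(cidr, nodes, preallocated_per_node, threshold=0.8,
                reserved=AWS_RESERVED_PER_SUBNET):
    """Estimate address-pool usage for one subnet (hypothetical model).

    Assumes every node holds `preallocated_per_node` addresses whether or
    not it uses them, which is the pattern the postmortem describes.
    """
    network = ipaddress.ip_network(cidr)
    usable = max(network.num_addresses - reserved, 0)
    # Demand beyond the pool size cannot be satisfied, so cap it.
    allocated = min(nodes * preallocated_per_node, usable)
    free = usable - allocated
    utilization = allocated / usable if usable else 1.0
    return PoolStatus(cidr, usable, allocated, free, utilization,
                      utilization >= threshold)


if __name__ == "__main__":
    # Hypothetical numbers: a /24 subnet shared by 20 nodes.
    for per_node in (4, 15):
        status = pool_status("10.0.0.0/24", nodes=20,
                             preallocated_per_node=per_node)
        print(per_node, status.free, f"{status.utilization:.0%}", status.alarm)
```

In this model, lowering the per-node pre-allocation frees addresses without changing the subnet, while spreading instances across more zones adds subnets and therefore more pools; both correspond to remediation items listed above.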