Neo4j Aura experienced a minor incident on March 31, 2025 affecting AuraDB Virtual Dedicated Cloud on AWS (*.databases.neo4j.io) and AuraDS Enterprise on AWS (*.databases.neo4j.io) and 1 more component, lasting 2h 21m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Mar 31, 2025, 01:15 PM UTC
We are currently investigating an issue that impacts the CMI feature, where requests return HTTP 401.
- monitoring Mar 31, 2025, 02:08 PM UTC
A fix has been implemented and the CMI feature is responding normally. We will monitor for a short time before considering this resolved.
- resolved Mar 31, 2025, 03:36 PM UTC
CMI functionality has been fully restored and this issue is considered resolved.
- postmortem Apr 10, 2025, 09:38 PM UTC
### What happened

On March 31, 2025, we experienced an incident that temporarily affected the availability of the Customer Metrics Integration (CMI) service for Aura instances. A change in our internal systems inadvertently caused errors in processing customer requests, so customers received error messages instead of the expected metrics. Our team identified the problem and rolled back the recent changes, restoring service functionality within 2.5 hours. We are committed to improving our processes to prevent similar issues in the future and ensure a seamless experience for our customers.

### Root Cause

The issue stemmed from the way our services interact with each other. The structure of one of the Aura Console API response fields changed, which resulted in unexpected dependency problems. We are actively working to improve the reliability of our services.

### Customer Impact

* Customers experienced service disruptions as they were unable to access metrics for their Aura instances.
* All customer requests returned an HTTP 401 error, preventing users from retrieving necessary information.

### Resolution

We rolled back the recent changes to restore service functionality quickly. Following that, we implemented changes in the Console API to reduce dependencies and redeployed them successfully.

**What we are doing now**

Neo4j remains committed to providing reliable service and is implementing additional safeguards to prevent similar incidents in the future. New mitigations being deployed:

* Enhancing our incident response playbook to improve handling of similar situations.
* Establishing a standardized internal Console API with clear versioning and documentation to prevent compatibility issues.
* Implementing comprehensive testing measures to identify potential issues before they affect our production environment.
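
As an illustration of the versioning and testing mitigations, the following minimal sketch shows one way a consuming service could guard against a change in the structure of an upstream response field. It is not Neo4j's implementation: the payload shapes, the field names (`instance_id`, `tenant_id`, `instance`, `tenant`), and the `parse_instance_access` helper are all hypothetical. The idea is that an unexpected shape raises an explicit contract error, which can be logged and alerted on, rather than surfacing downstream as an unrelated failure.

```python
from dataclasses import dataclass
from typing import Any, Mapping


class ContractError(ValueError):
    """Raised when an upstream payload does not match any known shape."""


@dataclass(frozen=True)
class InstanceAccess:
    instance_id: str
    tenant_id: str


def parse_instance_access(payload: Mapping[str, Any]) -> InstanceAccess:
    """Parse a hypothetical upstream payload, accepting two known shapes.

    Flat shape (hypothetical):   {"instance_id": "...", "tenant_id": "..."}
    Nested shape (hypothetical): {"instance": {"id": "...", "tenant": {"id": "..."}}}
    """
    if "instance" in payload:
        # Nested shape: every level must be a mapping before we read from it.
        instance = payload["instance"]
        if not isinstance(instance, Mapping):
            raise ContractError("'instance' must be an object")
        tenant = instance.get("tenant")
        instance_id = instance.get("id")
        tenant_id = tenant.get("id") if isinstance(tenant, Mapping) else None
    else:
        # Flat shape.
        instance_id = payload.get("instance_id")
        tenant_id = payload.get("tenant_id")

    # Fail loudly on missing or mistyped fields instead of passing None along.
    if not isinstance(instance_id, str) or not isinstance(tenant_id, str):
        raise ContractError("payload is missing instance or tenant identifiers")
    return InstanceAccess(instance_id=instance_id, tenant_id=tenant_id)
```

A contract test can then feed both known shapes and a deliberately malformed payload through the parser before deployment, so a structural change in the producer is caught in testing rather than in production.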