Exalate incident

Some Exalate nodes are unavailable

Exalate experienced a major incident on November 27, 2025, affecting the Hosting platform and lasting 19h 21m. The incident has been resolved; the full update timeline is below.

Started: Nov 27, 2025, 11:47 AM UTC
Resolved: Nov 28, 2025, 07:08 AM UTC
Duration: 19h 21m
Detected by Pingoru: Nov 27, 2025, 11:47 AM UTC

Affected components

Hosting platform

Update timeline

investigating Nov 27, 2025, 11:47 AM UTC

We are investigating reports of some Exalate nodes being unavailable. We will provide an update on the investigation shortly.
identified Nov 27, 2025, 12:33 PM UTC

We have identified the problem and are working to restore full service as soon as possible.
identified Nov 27, 2025, 01:47 PM UTC

Most nodes are currently running but we are still experiencing intermittent service failures. The full fix is being implemented now.
monitoring Nov 27, 2025, 03:31 PM UTC

All nodes are back online and we are closely monitoring the cluster health.
identified Nov 27, 2025, 04:51 PM UTC

We have detected renewed instability and are investigating why some nodes are unreachable again.
identified Nov 27, 2025, 08:48 PM UTC

All nodes are now back online, and our team is working on stabilizing the services.
monitoring Nov 27, 2025, 10:04 PM UTC

All nodes are now fully operational. Any communication errors reported in the console should clear after you select "Resolve and retry". If you encounter any further problems, please reach out to our support team. We continue to monitor the cluster closely and are investigating the root cause of the incident.
resolved Nov 28, 2025, 07:08 AM UTC

We have closely monitored the health of the affected Exalate nodes, and they are all performing as expected. A post-mortem of the incident will be published in due course.
postmortem Dec 12, 2025, 12:02 PM UTC

**Incident: Service Degradation - November 27, 2025**

**Duration:** 11:26 CET - 21:37 CET (~10 hours)

**Impact:** Approximately 700 Exalate Cloud nodes experienced intermittent connectivity issues, resulting in temporary disruption to data synchronization services. No data was lost during this incident.

**Summary:** On November 27, 2025, our infrastructure monitoring detected service degradation affecting a subset of Exalate Cloud customers. The root cause was identified as storage I/O contention on the underlying database infrastructure, which caused the primary database to become unresponsive intermittently.

**Timeline:**

* 11:26 CET - Issue detected by monitoring systems
* 11:30 CET - Engineering team began investigation
* 15:30 CET - Initial mitigation applied (infrastructure scaling)
* 18:30 CET - Root cause identified and permanent remediation initiated
* 21:37 CET - Full service restoration confirmed

**Root Cause:** Database storage was experiencing resource contention due to shared infrastructure components, causing elevated I/O latency. This triggered automated health checks to restart the database service repeatedly, compounding the connectivity issues.

**Resolution:** The database was migrated to dedicated storage resources and allocated additional compute capacity. All affected nodes were restored to full operation with no data loss.

**Preventive Measures:**

* Implementing enhanced storage latency monitoring and alerting
* Reviewing infrastructure health check configurations for database workloads
* Auditing storage allocation across all database clusters to prevent similar contention

We apologize for any inconvenience this incident may have caused. Our team remains committed to maintaining the reliability and performance of the Exalate Cloud platform.
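
For readers who want a concrete picture of the failure mode described under Root Cause, the following is a minimal illustrative sketch, not Exalate's actual health check. It shows one common way to keep a slow but live database from being restarted repeatedly: latency above a soft threshold only raises an alert, and a restart is triggered only after several consecutive failed probes. The names `HealthPolicy` and `evaluate`, and every threshold value, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthPolicy:
    """Hypothetical probe policy; every threshold here is an assumption."""
    slow_ms: float = 500.0        # at or above this, the probe counts as degraded
    timeout_ms: float = 5000.0    # at or above this, the probe counts as failed
    failures_to_restart: int = 3  # consecutive failed probes before a restart


def evaluate(latencies_ms, policy=HealthPolicy()):
    """Return one action per probe latency: 'ok', 'alert', or 'restart'."""
    actions = []
    consecutive_failures = 0
    for latency in latencies_ms:
        if latency >= policy.timeout_ms:
            # Failed probe: restart only once the failure streak is long enough.
            consecutive_failures += 1
            if consecutive_failures >= policy.failures_to_restart:
                actions.append("restart")
                consecutive_failures = 0
            else:
                actions.append("alert")
        elif latency >= policy.slow_ms:
            # Slow but responsive: alert, and reset the failure streak.
            consecutive_failures = 0
            actions.append("alert")
        else:
            consecutive_failures = 0
            actions.append("ok")
    return actions
```

Under these assumed thresholds, a run of elevated but sub-timeout latencies produces alerts only, so sustained I/O contention would surface through alerting rather than through a restart loop.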