JumpCloud experienced a minor incident on June 10, 2025 affecting RADIUS, lasting 1h 8m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Jun 10, 2025, 03:13 PM UTC
We're currently investigating reports of issues with JumpCloud's RADIUS-as-a-Service and working to determine the cause.
- identified Jun 10, 2025, 03:20 PM UTC
The issue with RADIUS has been identified and we are working on a fix.
- monitoring Jun 10, 2025, 03:57 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Jun 10, 2025, 04:22 PM UTC
This incident has been resolved.
- postmortem Jun 12, 2025, 08:36 PM UTC
# **Incident Report**

**Date**: Jun 12, 2025

**Date of Incident:** Jun 10, 2025

**Description**: RCA for USE1 RADIUS Service Disruption

**Summary:** On June 10th, between approximately 13:25 UTC and 15:30 UTC, following a planned operating system upgrade in the USE1 region, our RADIUS authentication service experienced an unexpected outage. Metrics indicated there was an underlying problem, and our engineering team was immediately engaged. To ensure service stability, the upgrade was rolled back, successfully restoring all services to their normal state.

**Root Cause:** The outage was triggered by a critical application failure that only manifested under the high-concurrency and heavy traffic conditions of our production environment. Our post-incident analysis identified multiple contributing factors:

* **Application Crash:** The direct cause was a critical memory error (`SIGSEGV` or Segmentation Fault) in the FreeRADIUS application, resulting in its termination with `exit code 139`. This memory violation was triggered by a suspected underlying incompatibility between the application's dependencies and the new operating system, likely stemming from differences in how the new environment handles memory management or system libraries.
* **Load Sensitivity:** The incompatibility was not detected during staging and pre-production testing. Post-revert analysis revealed that the application on the upgraded OS experienced significant spikes in CPU and memory utilization. This increased resource consumption became critical only when exposed to the high volume and concurrency of production user traffic, which ultimately triggered the memory access violation.

**Corrective Actions / Risk Mitigation:**

1. Immediately revert the operating system upgrade. - DONE
2. Increased alerting and monitoring at this layer - DONE
3. Update our performance testing environment to more accurately simulate the peak traffic patterns of production with more RADIUS protocol variants - IN PROGRESS
4. Deeper dependency analysis of the FreeRADIUS application and its libraries on the upgraded OS - IN PROGRESS
5. Review of our maintenance procedures ensuring traffic volume is analyzed and maintenance windows are optimal - IN PROGRESS
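
For readers unfamiliar with the `exit code 139` cited under Root Cause: shells conventionally report a process killed by a signal as 128 plus the signal number, and 139 - 128 = 11, which is `SIGSEGV` on Linux. The following is a minimal illustrative sketch of that decoding, not part of JumpCloud's tooling; the function name `describe_exit_code` is hypothetical.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper for illustration).

    By shell convention, a status above 128 means the process was
    terminated by signal number (status - 128).
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name for this signal number on the host OS.
            name = signal.Signals(signum).name
        except ValueError:
            return f"exit code {code}: above 128, but {signum} is not a known signal"
        return f"exit code {code}: terminated by {name} (signal {signum})"
    return f"exit code {code}: normal exit"


if __name__ == "__main__":
    print(describe_exit_code(139))
```

A check like this can be useful in alerting rules, since it distinguishes a crash caused by a signal from an ordinary non-zero exit.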