Box incident

[Outage] Issues with Box Services

Critical · Resolved

Box experienced a critical incident on September 4, 2025, affecting Content API, Login/SSO, and other components, lasting 1h 40m. The incident has been resolved; the full update timeline is below.

Started
Sep 04, 2025, 06:20 PM UTC
Resolved
Sep 04, 2025, 08:00 PM UTC
Duration
1h 40m
Detected by Pingoru
Sep 04, 2025, 06:20 PM UTC

Affected components

Content API, Login/SSO, Microsoft Online, Case Submission Workflow, Web Application, Australia, Content Preview, Box Sync

Update timeline

  1. investigating Sep 04, 2025, 06:20 PM UTC

    We are investigating an ongoing issue affecting Box. Users may experience errors or long load times using Box services. We will provide more information as soon as it is available.

  2. investigating Sep 04, 2025, 06:23 PM UTC

    We are continuing to investigate this issue.

  3. investigating Sep 04, 2025, 06:28 PM UTC

    We are continuing to investigate this issue.

  4. identified Sep 04, 2025, 06:33 PM UTC

    The issue has been identified and a fix is being implemented.

  5. monitoring Sep 04, 2025, 06:45 PM UTC

    A fix has been implemented and we are monitoring the results.

  6. monitoring Sep 04, 2025, 07:12 PM UTC

    Box services have recovered. We're continuing to monitor and ensure full restoration.

  7. resolved Sep 04, 2025, 08:00 PM UTC

    After further monitoring, this incident is now considered resolved. All services have been restored to full functionality. If you continue to experience any issues, please contact Box Support at https://support.box.com.

  8. postmortem Sep 05, 2025, 10:42 PM UTC

    We recently addressed issues affecting Box. We would like to take the opportunity to further explain these issues and the steps we have taken to keep them from happening in the future.

    On September 4, 2025 between 11:03 AM and 11:24 AM PDT, Box was unavailable to users. The outage occurred when maintenance performed on our caching tier triggered a latent bug. We were able to remediate the issue by manually restoring the caching tier to its pre-maintenance state. After the initial issue was resolved, between 11:24 AM and 01:16 PM PDT, a limited number of users may have continued to experience slowness when interacting with some Box features.

    **Analysis**

    The issue was caused during a maintenance event on our caching tier to remove a dependent component. Although the change had been successfully tested in pre-production, a configuration difference between environments meant the error path was never exercised and the issue went undetected. When the dependent component was removed in production during the maintenance, client instances encountered errors and were unable to correctly handle incoming requests. Restoring the dependent component to its previous state allowed normal operations to resume. A limited number of client instances remained degraded and were recreated to recover full performance.

    **Corrective Actions**

    Box has initiated the following corrective actions:

    * Improve error handling within our caching client instances to reduce the impact of caching tier failures.
    * Add observability for configuration differences between our testing and production environments.
    * Enhance deployment procedure to deploy caching tier changes more incrementally.
    * Shorten the time taken to restore the caching tier when issues occur.

    We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter.

    Sincerely,
    The Box Team
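The first corrective action in the postmortem, improving error handling in caching clients, can be illustrated with a minimal sketch. This is not Box's implementation: `FlakyCache`, `CacheUnavailable`, and `read_through` are hypothetical names invented here, and the toy in-memory cache only simulates a caching tier that becomes unreachable. The idea shown is that a client treats a cache failure as a cache miss and falls back to the origin instead of failing the request.

```python
from typing import Callable, Dict, Optional


class CacheUnavailable(Exception):
    """Raised by the hypothetical cache backend when it cannot serve a request."""


class FlakyCache:
    """Toy in-memory cache whose availability can be toggled to simulate a tier failure."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}
        self.available = True

    def get(self, key: str) -> Optional[str]:
        if not self.available:
            raise CacheUnavailable("cache tier unreachable")
        return self.store.get(key)

    def set(self, key: str, value: str) -> None:
        if not self.available:
            raise CacheUnavailable("cache tier unreachable")
        self.store[key] = value


def read_through(
    cache: FlakyCache, key: str, load_from_origin: Callable[[str], str]
) -> str:
    """Return the value for key, treating any cache failure as a cache miss."""
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
    except CacheUnavailable:
        pass  # degrade to the origin instead of failing the request

    value = load_from_origin(key)
    try:
        cache.set(key, value)
    except CacheUnavailable:
        pass  # best-effort repopulation; the caller still gets a value
    return value
```

With `cache.available = False`, a call such as `read_through(cache, "file:1", origin.__getitem__)` still returns the origin's value. In a real system this fallback would likely need rate limiting or a circuit breaker so that a caching tier outage does not overload the origin.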