Spruce Health incident

[RESOLVED] Cloud Provider (AWS) Outage


Spruce Health experienced a major incident on October 20, 2025, affecting Web App, Mobile Apps, and 1 more component, lasting 20h 50m. The incident has been resolved; the full update timeline is below.

Started
Oct 20, 2025, 07:41 AM UTC
Resolved
Oct 21, 2025, 04:31 AM UTC
Duration
20h 50m
Detected by Pingoru
Oct 20, 2025, 07:41 AM UTC

Affected components

Web App, Mobile Apps, Phone Call Routing, Fax, SMS Routing, Video Calling, Spruce app notifications, Integrations, API

Update timeline

  1. investigating Oct 20, 2025, 07:41 AM UTC

    Our cloud provider AWS is experiencing a major outage that is affecting users' ability to log into Spruce and our ability to route phone calls. All messages received during this time will be delivered when the system recovers. We are currently investigating the full impact.

  2. investigating Oct 20, 2025, 09:06 AM UTC

    We are continuing to investigate workarounds. However, our telephony infrastructure partner is also affected by the major AWS outage and cannot deliver phone calls or SMS to our system, so we cannot provide the backup system coverage that we normally offer during a Spruce outage. AWS has reported that they have identified the root cause of the issue and are working on a resolution.

  3. identified Oct 20, 2025, 10:33 AM UTC

    AWS has restored most service, and our telephony partner is beginning to restore service. The Spruce inbox is now accessible and secure messaging is working. Phone and SMS continue to be impacted. At this time we still expect that SMS sent and received during the outage will be delivered after our telephony partner's service has been restored.

  4. monitoring Oct 20, 2025, 11:55 AM UTC

    We're seeing strong signs of recovery. Phone and video calls are now functional, and most users can once again send and receive SMS messages. Some delays in receiving SMS messages may still occur as our telephony partner continues their recovery efforts. Our team is closely monitoring the situation and will share additional updates as more information becomes available.

  5. monitoring Oct 20, 2025, 01:14 PM UTC

    We're seeing recovery across all systems. Our telephony partner is still reporting that there could be elevated rates of dropped calls, and that SMS may be delayed, but the majority of calls and SMS are now working normally. We will continue to provide updated information here as it becomes available.

  6. monitoring Oct 20, 2025, 02:17 PM UTC

    Our cloud provider AWS continues to experience operational issues, which is delaying our recovery. Our system is currently servicing requests at partial capacity. Outbound fax and SMS are delayed. Inbound and outbound calls are functional, but full recovery is not yet guaranteed. In general, you will continue to experience intermittent issues with Spruce while the platform operates at limited capacity.

  7. monitoring Oct 20, 2025, 02:28 PM UTC

    We have enabled the Spruce backup system (for the second time during this incident) as of 7:21 AM PT. The backup system will help notify those who have subscribed to updates of any incoming calls and texts. You can learn more about the backup system here: https://help.sprucehealth.com/hc/en-us/articles/23003270604699-Spruce-Backup-System

    The Spruce inbox is intermittently loading at the moment.

  8. monitoring Oct 20, 2025, 02:58 PM UTC

    Our cloud provider (AWS) is currently experiencing widespread issues that are impacting the availability and performance of the Spruce platform. We are closely monitoring the situation and working to restore full functionality. At this time, you may experience disruptions across multiple areas of Spruce, including loading the inbox, sending and receiving messages, calls, SMS, faxes, and video visits. Some features may work intermittently or with delays.

  9. monitoring Oct 20, 2025, 03:45 PM UTC

    As of 8:05 AM PT, the Spruce inbox is loading and operational. Inbound and outbound calls, as well as video visits, are currently functional. There continue to be delays in processing inbound and outbound call events, SMS messages, faxes, and emails. Please note that AWS, our cloud provider, has not yet provided an update on their recovery. As a result, platform stability cannot be guaranteed at this time. We are continuing to monitor the situation closely and will share updates as they become available.

  10. monitoring Oct 20, 2025, 05:37 PM UTC

    The Spruce inbox, as well as the ability to make and receive calls, send and receive secure messages, and conduct video visits, all remain currently functional. There continue to be significant, multi-hour delays in processing inbound and outbound call events, SMS messages, faxes, emails, and search-related updates. These delays have persisted for some time, and we are unable to increase processing capacity given the ongoing issues with our cloud provider, AWS. We are continuing to closely monitor the situation. While AWS has been providing updates, none have indicated signs of recovery. As a result, we cannot guarantee that services will remain stable until more progress is reported.

  11. monitoring Oct 20, 2025, 06:30 PM UTC

    The Spruce inbox remains operational, and you can continue to make and receive calls, exchange secure messages, and conduct video visits. We are, however, experiencing additional delays in processing certain system events due to new issues reported by our cloud provider, AWS. Another part of their infrastructure that Spruce depends on has been impacted.

    • Push notifications: There is a delay in processing push notifications, which may prevent patients and providers from receiving real-time alerts for new messages or updates.
    • Backup system events: There is a delay in processing Spruce Backup system events, resulting in delayed delivery of email and SMS notifications related to inbound SMS and calls.

    In relatively good news, we’ve been able to fully catch up and remain up to date with inbound SMS, email, and fax processing. Our team is now working to catch up on outbound SMS, call events, fax, and email in a similar manner.

    AWS continues to provide updates, but none have indicated signs of recovery yet. We are continuing to closely monitor the situation and will share updates as they become available.

  12. monitoring Oct 20, 2025, 07:50 PM UTC

    Our systems are no longer experiencing delays in processing call events, outbound SMS, faxes, or emails. We have caught up on the backlog, and any previously delayed items should now appear in the inbox with a “delayed message” indicator. AWS has reported early signs of recovery, and we are also observing an increase in our system capacity as a result of prior scaling efforts.

    At this time, the Spruce inbox remains fully functional. Calls, secure messages, video visits, and related events are operating normally, and call events, emails, faxes, and SMS messages should now be delivered in a timely manner.

    There are still delays in processing push notifications and a backlog of search-related events. As a result, some users may experience delayed or missing push notifications, and search results or contact lists may appear stale or out of date. We are monitoring both areas closely to ensure that they continue to process and catch up over time.

    The backup system has been turned off now that there are no longer delays in processing inbound or outbound events. We will continue to closely monitor the situation to ensure the platform remains stable and fully restored.

  13. monitoring Oct 20, 2025, 09:35 PM UTC

    We are seeing continued signs of recovery from AWS. Our push notification delivery system has caught up, and the Spruce Backup System has completed processing all pending notifications. Search-related events have also caught up, and the system is now running at its intended capacity to fully process requests. We are reviewing patient and provider accounts created during the outage period to ensure that all data is intact and consistent. Our communication infrastructure provider, Twilio, which also depends on AWS, continues to work toward full recovery. As they restore their services, inbound and outbound SMS may still experience delays, and call connections may occasionally be impacted. We are closely monitoring the recovery of both AWS and Twilio to ensure that Spruce remains fully operational and stable.

  14. resolved Oct 21, 2025, 04:31 AM UTC

    AWS has confirmed full recovery of their services, and Twilio has also reported full restoration of functionality. All Spruce systems are operating normally. The inbox, calls, secure messaging, video visits, faxes, and SMS are functioning as expected, and notifications are being delivered in real time. With all systems stable and fully operational, we are considering this incident resolved.

    Summary: From 11:57 PM PDT on October 19 to 2:25 PM PDT on October 20, Spruce experienced platform-wide instability caused by a widespread AWS outage and a concurrent Twilio disruption. There were three separate windows of major outage, during which most Spruce services were unavailable:

    - 12:04 AM → 02:23 AM PDT
    - 07:12 AM → 08:04 AM PDT
    - 11:42 AM → 11:52 AM PDT

    During other periods, the platform was in a partial outage state, where functionality was degraded but not entirely down.

    The incident impacted the following areas of Spruce:

    - Inbox loading and conversation access
    - Sending and receiving secure messages, SMS, faxes, and emails
    - Call handling and voicemail
    - Video visits
    - Public API

    We will follow up with a detailed postmortem outlining root causes and mitigations.

  15. postmortem Nov 12, 2025, 09:55 PM UTC

    ### Summary

    Between 11:57 PM PDT on October 19 and 2:25 PM PDT on October 20, Spruce experienced a platform-wide disruption due to a major AWS outage and a concurrent Twilio issue. There were three periods of major outage when most Spruce functionality was unavailable:

    * 12:04 AM → 02:23 AM PDT
    * 07:12 AM → 08:04 AM PDT
    * 11:42 AM → 11:52 AM PDT

    Outside these windows, the platform remained partially degraded. The outage affected inbox access, secure messaging, calls, SMS, faxes, emails, video visits, and the public API. Services already running on healthy AWS instances continued to function intermittently, but scaling and communication components were heavily impacted.

    ### Analysis

    The incident stemmed from widespread failures in AWS control-plane components (including EC2 orchestration, DynamoDB, Lambda, and ECR), preventing new ECS tasks from launching and disrupting core Spruce services. Twilio's concurrent outage compounded the impact, delaying or blocking call and message delivery.

    ### Next Steps

    In response, we've updated our on-call notification schedule for faster follow-the-sun escalations to our non-US engineers, and introduced process improvements and clearer communication practices to streamline incident response and keep users better informed. We're also evaluating backup system independence from AWS control-plane services and improvements to routing throughput and retry handling. These steps aim to improve both resilience and response speed for future large-scale incidents.
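
The time windows quoted in the summary can be turned into durations with a few lines of arithmetic. The following is a minimal sketch, not part of Spruce's tooling: it assumes PDT is a fixed UTC-7 offset on these dates, and the helper `pdt` and all variable names are hypothetical, introduced only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PDT is modeled as a fixed UTC-7 offset, valid for October 19-20, 2025.
PDT = timezone(timedelta(hours=-7), "PDT")


def pdt(day: int, hour: int, minute: int) -> datetime:
    """Build an October 2025 timestamp in PDT (illustrative helper)."""
    return datetime(2025, 10, day, hour, minute, tzinfo=PDT)


# Overall instability period, as stated in the summary.
instability_start = pdt(19, 23, 57)
instability_end = pdt(20, 14, 25)

# The three major-outage windows, all on October 20.
major_windows = [
    (pdt(20, 0, 4), pdt(20, 2, 23)),
    (pdt(20, 7, 12), pdt(20, 8, 4)),
    (pdt(20, 11, 42), pdt(20, 11, 52)),
]

instability = instability_end - instability_start
major_total = sum((end - start for start, end in major_windows), timedelta())
start_utc = instability_start.astimezone(timezone.utc)

print("Instability period:", instability)      # 14:28:00
print("Total major outage:", major_total)      # 3:21:00
print("Start in UTC:", start_utc.isoformat())  # 2025-10-20T06:57:00+00:00
```

Under these assumptions, the three major-outage windows add up to 3h 21m of the 14h 28m instability period. The status page's 20h 50m duration is longer because it is measured from the first post to the final "resolved" update.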