Is Lucidworks down?
Last checked 5m ago. No incidents right now.
Lucidworks is operational right now. Last checked 5m ago; the most recent incident resolved 16d ago.
Real-time Lucidworks status, recent outages, and incident history — pulled directly from Lucidworks's official status page at https://status.lucidworks.com every 5 minutes. Pingoru tracks 7 Lucidworks services and has captured 3 incidents in the last 90 days (99.21% uptime). Get email, Slack, Discord, or webhook alerts the moment Lucidworks reports a new incident — free for 5 monitors, no credit card.
Recent outages & incidents
Past 90 days
- Lucidworks AI
Timeline · 5 updates
- investigating · May 28, 2026, 05:14 PM UTC
Certain Lucidworks AI hosted models (llama-3-8b-instruct, llama-3v2-3b-instruct, and phi-4-multimodal-instruct) in the us-southcarolina region are experiencing a service disruption, causing them to be currently unavailable. Customers attempting to use these Lucidworks-hosted models are receiving 500 or 429 errors for queries that rely on them. End users utilizing services that depend on these specific models are encountering errors, though passthrough LLM calls remain fully functional. We are currently investigating the issue to determine the cause and are actively working to restore full availability. We will provide further updates as new information becomes available.
- identified · May 28, 2026, 05:39 PM UTC
The service disruption affecting certain Lucidworks AI hosted models remains ongoing, and the models are currently unavailable. The disruption occurred during routine Kubernetes maintenance involving node upgrades, during which a cloud provider capacity shortfall prevented the instances hosting these models from re-launching. We have secured the needed capacity and have engaged our cloud provider to resolve the remaining delays in bringing the models back up. We will provide further updates as additional information becomes available.
- monitoring · May 28, 2026, 06:01 PM UTC
We have successfully launched new nodes with the required LLM images, and redirected Lucidworks AI traffic to them. All LWAI hosted models are fully functional again, and stability has been confirmed through successful 200 responses from the prediction endpoint. End-user functionality is fully restored, with queries no longer returning errors. We are continuing to observe monitoring metrics to ensure system stability remains consistent before providing a final resolution update.
- resolved · May 28, 2026, 06:08 PM UTC
The service disruption affecting certain Lucidworks AI hosted models has been fully resolved, and all previously affected services are operating normally. End-user functionality is fully restored, with all dependent services operating as expected. The disruption originated during routine Kubernetes maintenance involving node upgrades, during which a cloud provider capacity shortfall prevented the instances hosting these models from re-launching. Stability was restored after successfully launching new nodes and pulling the required images. We will share a postmortem report containing the full root cause analysis within three business days.
- postmortem · Jun 02, 2026, 10:33 PM UTC
## Summary

On May 28, 2026, between 16:18 UTC and 17:53 UTC, Lucidworks AI hosted LLMs were unavailable in the `us-southcarolina` region. Customers using hosted LLM inference (`llama-3-8b-instruct`, `llama-3v2-3b-instruct`, and `phi-4-multimodal-instruct`) received 500 or 429 errors when attempting to query these models. Other SaaS Platform services, including search, embedding models, and the Lucidworks Platform UI, were not affected.

Lucidworks Engineering declared a Sev1 incident at 16:54 UTC and began remediation efforts. All hosted LLM models were fully restored and operational at 17:53 UTC.

## Root Cause

The incident was caused by a routine Kubernetes patch upgrade on a cluster in the `us-southcarolina` region. LWAI-hosted models are served via [Ray Serve](https://docs.ray.io/en/latest/serve/index.html), which uses both “head” and “worker” nodes as part of its deployment system for routing inference requests. The Kubernetes upgrade cycled node pools, causing all cluster head nodes and worker nodes to restart simultaneously.

Under normal conditions, the Ray cluster can tolerate a head node restart because worker nodes continue serving requests. However, the node pool upgrades utilize a surge strategy where our platform waits for pods to leave the old node (be evicted) but not for them to be running on the new node. The platform considers the “drain” successful once the pod is gone from the old node and moves on to the next one, even if the pod is stuck in an “Initializing” state on the new node. This meant that all node pools were cycled in rapid succession, and both head and worker pods were evicted before any had finished initializing on their replacement nodes, resulting in a complete cluster outage.

Recovery was prolonged by multiple compounding factors. First, one of the replacement head nodes was in a degraded state and unable to pull container images, requiring manual intervention to delete the node. Second, the LLM container images (6-11 GB in size) experienced abnormally slow Docker image transfer, taking 30-52 minutes compared to the typical 2-4 minutes observed in normal operation. Additionally, in a separate operation, new models were being brought online to expand our LWAI offering, and this caused the Ray operator's blue-green deployment strategy to require the existing _and_ replacement LLM deployments to be healthy before switching traffic, which extended the outage until the slower image pulls completed on both blue and green deployments.

Lucidworks Engineering deleted the degraded node, waited for image pulls to complete on replacement nodes, and verified that all hosted models were responding to queries. The incident was verified as resolved at 18:07 UTC.

## Lucidworks Actions

Lucidworks will take the following actions as a result of this incident:

* Implement sequenced Kubernetes upgrade procedures for clusters hosting LLM workloads, ensuring each node pool is fully healthy before the next pool is upgraded.
* Investigate Docker image pre-loading strategies (such as pre-baked disks or image streaming) to eliminate long container image pull times for large ML model images.
* Open a support ticket with our cloud provider to investigate the abnormal Docker pull times.
* Establish a notification protocol to coordinate Kubernetes maintenance windows with LLM service owners to avoid conflicts with ongoing deployments.
* Improve tooling around our Ray clusters to allow Lucidworks Engineering to force a failover to a blue or green state instead of waiting for Ray to automatically resolve the new and old deployments.

## Recommended Client Actions

Lucidworks recommends that clients subscribe to Lucidworks status updates to receive real-time notifications about Lucidworks SaaS Platform incidents. To enable this feature, click **Subscribe to Updates** at [status.lucidworks.com](http://status.lucidworks.com).
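The root cause and the first action item in this postmortem come down to one gating rule: a node pool upgrade should count as finished only when its pods are running on the new nodes, not merely evicted from the old ones. The following is a minimal sketch of that rule and not Lucidworks's actual tooling. The `NodePool` type, the pool names, the pod states, and the `upgrade` callback are all hypothetical stand-ins for whatever the real platform exposes.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class NodePool:
    name: str
    # Hypothetical pod states observed after the pool has been cycled.
    pod_states_after_upgrade: List[str] = field(default_factory=list)


def pool_is_healthy(pool: NodePool) -> bool:
    """A pool is healthy only when every pod is Running on its new node."""
    states = pool.pod_states_after_upgrade
    return bool(states) and all(state == "Running" for state in states)


def upgrade_sequentially(
    pools: List[NodePool],
    upgrade: Callable[[NodePool], None],
    is_healthy: Callable[[NodePool], bool] = pool_is_healthy,
) -> Tuple[List[str], Optional[str]]:
    """Upgrade pools one at a time and stop at the first pool that does not recover."""
    upgraded: List[str] = []
    for pool in pools:
        upgrade(pool)
        if not is_healthy(pool):
            # Halt here so the pools not yet touched keep serving traffic.
            return upgraded, pool.name
        upgraded.append(pool.name)
    return upgraded, None


# Illustrative run: the head pool is stuck initializing, so workers-b is never cycled.
pools = [
    NodePool("workers-a", ["Running", "Running"]),
    NodePool("head", ["Initializing"]),
    NodePool("workers-b", ["Running"]),
]
done, blocked = upgrade_sequentially(pools, upgrade=lambda pool: None)
print(done, blocked)
```

In a real cluster the health check would presumably poll with a timeout rather than read a fixed list, but the control flow is the point: an eviction-only check would have moved on past the stuck head pool.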
- Lucidworks Platform
Timeline · 5 updates
- investigating · May 12, 2026, 11:04 PM UTC
We are currently investigating an issue affecting the Lucidworks Platform user interface. While services hosted on the Platform remain operational and the Beacon continues to receive signals from client sites, all SaaS users are unable to access the SaaS Platform interface. Users attempting to load platform.lucidworks.com encounter an error stating that applications failed to load and menu items may be unavailable. There is no direct impact to the client’s end-user experience at this time. We are currently working to identify the cause of this disruption and will provide further updates as more information becomes available.
- identified · May 12, 2026, 11:23 PM UTC
Lucidworks has identified that a recent configuration update resulted in the removal of a critical data component within our internal database. This discrepancy is preventing the SaaS Platform interface from retrieving the information required to load correctly, leading to the access errors currently being experienced by users. While services hosted on the Platform remain operational and Lucidworks AI and Beacon services continue to function normally, all SaaS users remain unable to access the SaaS Platform interface. We are currently implementing a database update to restore the missing components and are repopulating the necessary data from a secure backup to ensure all settings are accurately recovered. Users should refrain from attempting to reconfigure platform settings or applications until access is fully restored. We will provide a further update as soon as more information is available.
- monitoring · May 12, 2026, 11:55 PM UTC
Engineering has successfully deployed a fix to restore the missing data, and the fix has been merged into the production environment. We are now observing a period of stability following these mitigation actions. The SaaS Platform interface is now accessible, and the previously reported errors have been cleared. Lucidworks AI and Beacon services remain stable and were unaffected throughout the event. No widespread user impact is currently being observed, though we are performing final data validation to ensure all configuration values have been fully restored across the environment. We are continuing to monitor platform stability and system logs to confirm that all services maintain normal performance levels. We will provide a final update once the monitoring period is complete and the incident is fully resolved.
- resolved · May 13, 2026, 12:01 AM UTC
Lucidworks has confirmed the full resolution of this incident. The SaaS Platform interface is once again accessible, and the errors previously encountered at platform.lucidworks.com have been cleared. The Lucidworks Platform is operating normally for all users. Lucidworks AI and Beacon services remained stable and were not affected throughout this event. There is no lingering impact to platform users or end-user experiences. A postmortem report will be shared once the internal review is complete.
- postmortem · May 20, 2026, 09:05 PM UTC
## Summary

On May 12, 2026, at 21:19 UTC, the Lucidworks SaaS Platform experienced an access disruption. Users encountered errors when attempting to use the Lucidworks Platform UI ([platform.lucidworks.com](http://platform.lucidworks.com)). Other SaaS Platform services, including all search and Lucidworks AI functionality, were not affected.

Lucidworks Engineering identified the issue at 21:25 UTC and began remediation efforts. The platform was fully restored and operational at 23:41 UTC.

## Root Cause

The incident was caused by a database schema change that was included alongside an application code update. The Lucidworks Platform runs multiple instances of each application simultaneously, and new code is rolled out incrementally. New application instances are brought online while existing instances continue serving traffic.

In this case, the schema change removed a database column before all running instances of the application had been updated to use the new schema. During the rollout window, existing application instances attempted to query the removed column, resulting in errors that prevented users from accessing the platform.

Lucidworks Engineering deployed a fix that restored the affected database column and repopulated the necessary data from the most recent database backup. The incident was verified as resolved at 23:41 UTC.

## Lucidworks Actions

Lucidworks will take the following actions as a result of this incident:

* Enforce phased rollout procedures for database schema changes, ensuring destructive modifications are deployed separately from application code updates to maintain backward compatibility through the rollout window.
* Update automated code review tooling to detect and flag destructive database operations, adding an additional layer of defense during the review process.
* Evaluate automated incident creation from critical alerts to reduce the time between detection and formal incident response.

The Lucidworks engineering team is committed to ensuring this type of incident does not recur. These enhancements will strengthen the platform's resilience and reliability for all customers.

## Recommended Client Actions

Lucidworks recommends that clients subscribe to Lucidworks status updates to receive real-time notifications about Lucidworks SaaS Platform incidents. To enable this feature, click **Subscribe to Updates** at [status.lucidworks.com](http://status.lucidworks.com).
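One of the actions in this postmortem is tooling that flags destructive database operations during review. As a rough illustration of what such a check might look like, the sketch below scans a migration script for statements that drop or truncate data. The pattern list, the `find_destructive_statements` helper, and the `app_settings` table and its columns are all hypothetical; the postmortem does not describe the actual tooling, and a production check would more likely parse the SQL than pattern-match it.

```python
import re
from typing import List

# Patterns treated as destructive for this sketch (an assumption, not an exhaustive list).
DESTRUCTIVE_PATTERNS = [
    re.compile(r"\bDROP\s+(TABLE|COLUMN|INDEX)\b", re.IGNORECASE),
    re.compile(r"\bALTER\s+TABLE\b.*\bDROP\b", re.IGNORECASE | re.DOTALL),
    re.compile(r"\bTRUNCATE\b", re.IGNORECASE),
]


def find_destructive_statements(migration_sql: str) -> List[str]:
    """Return the statements in a migration that match a destructive pattern."""
    statements = [s.strip() for s in migration_sql.split(";") if s.strip()]
    return [
        statement
        for statement in statements
        if any(pattern.search(statement) for pattern in DESTRUCTIVE_PATTERNS)
    ]


# Hypothetical migration: the additive change passes, the column drop is flagged.
migration = """
ALTER TABLE app_settings ADD COLUMN display_name TEXT;
ALTER TABLE app_settings DROP COLUMN legacy_name;
"""
flagged = find_destructive_statements(migration)
print(flagged)
```

A flagged statement would then be held back for a later release, after every running instance has stopped reading the column, which is the phased rollout the first action item describes.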
-
Timeline · 2 updates
- resolved · Apr 24, 2026, 04:34 PM UTC
The Lucidworks SaaS Platform experienced an issue that caused a total disruption of service routing, rendering the platform inaccessible. Users were unable to log in or manage any platform configurations during this period. End users were unable to access search results or interact with any deployed services. The incident was caused by an unintended deployment of a configuration change at 4:06 PM UTC that should not have gone to production. We resolved the issue by performing a full rollback of the faulty change, which was completed at 4:13 PM UTC. A postmortem report will be posted with the full incident details.
- postmortem · May 18, 2026, 10:56 PM UTC
## Summary

On April 24, 2026, at 16:05 UTC, the Lucidworks SaaS Platform experienced an issue that made the entire platform temporarily unavailable and all requests returned IO errors. Lucidworks Engineering was made aware of the issue and reverted the change on April 24, 2026, at 16:08 UTC. The reversion took a few minutes to propagate through the system and successful responses were restored by April 24, 2026, at 16:13 UTC.

## Root Cause

The incident was caused by a configuration change to the SaaS Platform gateway components that was deployed to production at 16:05 UTC on April 24, 2026. This change introduced an error in the configuration template that caused it to render syntactically invalid output in the production environment. When the deployment system processed this invalid configuration, it interpreted the malformed output as an indication that the gateway components (routing and load-balancing infrastructure) were no longer needed and removed these components from the platform. This caused all incoming requests to fail with IO errors as there were no gateway instances available to handle traffic.

Lucidworks engineers were actively monitoring this deployment as it rolled out, and immediately noticed the problem that the deployment introduced. The configuration change was quickly reverted at 16:08 UTC. The deployment infrastructure then correctly reinstated the gateway components, which became fully operational by 16:13 UTC and restored normal platform operation.

## Lucidworks Actions

Lucidworks has taken the following actions as a result of this incident:

* Added pre-deployment validation to verify that configuration templates render syntactically valid output before deployment, and that changes of this nature are deployed to a development environment before deploying the changes to production.
* Modified the deployment workflow to require additional verification steps before pull requests can be merged.

## Recommended Client Actions

Lucidworks recommends that clients subscribe to Lucidworks status updates to receive real-time notifications about Lucidworks SaaS Platform incidents. To enable this feature, click **Subscribe to Updates** at [status.lucidworks.com](http://status.lucidworks.com).
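The pre-deployment validation described in this postmortem can be pictured as a render-then-parse gate: render the template with the target environment's values, confirm the output parses, and confirm the sections that must exist are still present. The sketch below is only an illustration under several assumptions: that the configuration is JSON, that it is templated with Python's `string.Template`, and that `gateway` and `routes` are the required sections. None of these details come from the postmortem.

```python
import json
from string import Template
from typing import Dict, List


def validate_rendered_config(
    template_text: str, values: Dict[str, str], required_keys: List[str]
) -> List[str]:
    """Return a list of problems; an empty list means the render looks deployable."""
    try:
        rendered = Template(template_text).substitute(values)
    except (KeyError, ValueError) as exc:
        return [f"template failed to render: {exc!r}"]
    try:
        config = json.loads(rendered)
    except json.JSONDecodeError as exc:
        return [f"rendered output is not valid JSON: {exc.msg}"]
    if not isinstance(config, dict):
        return ["rendered output is not a JSON object"]
    # Guard against a render that parses but silently drops a whole section.
    return [f"missing required section: {key}" for key in required_keys if key not in config]


# Hypothetical templates: the second one has an unbalanced brace.
good_template = '{"gateway": {"replicas": $replicas}, "routes": []}'
bad_template = '{"gateway": {"replicas": $replicas, "routes": []}'

good_problems = validate_rendered_config(good_template, {"replicas": "3"}, ["gateway", "routes"])
bad_problems = validate_rendered_config(bad_template, {"replicas": "3"}, ["gateway", "routes"])
print(good_problems, bad_problems)
```

The required-sections check matters as much as the syntax check here, since the outage came from the deployment system treating malformed output as a request to remove the gateway components.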
- Started May 28, 2026, 05:14 PM UTC · Resolved May 28, 2026, 06:08 PM UTC · 53m
- SaaS Platform Access Disruption · Resolved · Started May 12, 2026, 09:19 PM UTC · Resolved May 13, 2026, 12:01 AM UTC · 2h 42m
- Started Apr 24, 2026, 04:05 PM UTC · Resolved Apr 24, 2026, 04:13 PM UTC · 8m