Shelf.io incident
Search, Announcements, and Feedback API errors (US)
Shelf.io experienced a minor incident on October 17, 2025 affecting Viewing Content & Search, lasting 41 minutes. The incident has been resolved; the full update timeline is below.
Affected components
- Viewing Content & Search (US)
Update timeline
- investigating Oct 17, 2025, 03:34 PM UTC
We are currently investigating increased error rates affecting search and content discovery in our US region. Some users may experience slower response times or intermittent errors when searching for content, viewing announcements, or submitting feedback. Other platform capabilities, including content creation and editing, remain fully operational.
- monitoring Oct 17, 2025, 04:03 PM UTC
We have identified and addressed the underlying issue affecting search and content discovery performance in the US region, and our engineering team is actively monitoring system behavior. During peak impact, approximately 20-40% of search and content discovery requests returned errors. We will continue to observe system performance to confirm full stability.
- resolved Oct 19, 2025, 06:29 PM UTC
This incident has been resolved. All search, announcement, and feedback services in the US region have returned to normal operation. System monitoring confirms stable performance with no residual errors. We apologize for any disruption this may have caused.
- postmortem Oct 19, 2025, 07:47 PM UTC
On October 17, 2025, between 15:22 and 16:03 UTC, users in our US region experienced intermittent errors when searching for content, viewing announcements, and submitting feedback. At the peak of the incident, approximately 40% of requests to these services returned errors, while other platform capabilities remained fully operational.

This event occurred during an ongoing maintenance process to upgrade our search relevance engine, work aimed at delivering more accurate, higher-quality results to users. We ran this upgrade in the background for several days at minimal load to avoid disruption. Near the end of the week, a configuration error caused the process to place more load on the system than intended. Coupled with a spike in search activity, this created temporary capacity pressure and intermittent errors in the US region. The outage was partial: error rates varied over the window and peaked around 40%, with the majority of requests continuing to succeed.

We have already applied lessons from a prior search incident and rolled out the upgraded search infrastructure in the EU and CA regions; the US region is scheduled next. We will continue with this upgrade, and we will communicate maintenance windows in advance, align change timing to lower-traffic periods, and add redundancy specific to search operations to further protect performance during these necessary maintenance activities.
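The root cause described above, a background job consuming more capacity than intended, is commonly guarded against with an explicit rate limit built into the job itself, so a misconfiguration cannot silently push load onto the live system. The sketch below is purely illustrative; the class name and the rate limit are hypothetical and do not describe Shelf.io's actual implementation:

```python
import time


class Throttle:
    """Cap a background job (e.g. a reindexing pass) at a fixed number
    of operations per second, independent of any external config."""

    def __init__(self, max_ops_per_sec: float):
        if max_ops_per_sec <= 0:
            raise ValueError("rate limit must be positive")
        self.min_interval = 1.0 / max_ops_per_sec  # seconds between ops
        self.last_op = 0.0

    def wait(self) -> None:
        """Block until the next operation is allowed."""
        now = time.monotonic()
        elapsed = now - self.last_op
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_op = time.monotonic()


# Example: process documents at no more than 50 ops/sec
throttle = Throttle(max_ops_per_sec=50)
for doc_id in range(5):
    throttle.wait()
    # reindex_document(doc_id)  # hypothetical per-document reindex call
```

A hard cap like this acts as a backstop: even if a higher-level configuration error requests more throughput, the job cannot exceed the ceiling, which is one way to keep "minimal load" maintenance genuinely minimal.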