Shelf.io Outage History

Shelf.io had 14 outages in the last 2 years totaling 52h 57m of downtime, averaging 0.6 incidents per month.

The 14 Shelf.io outages since June 18, 2024 are summarised below, each with incident details, duration, and resolution information.
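
The summary figures can be reproduced from the per-incident timestamps. The following is a minimal sketch, not the method used by the status tracker: the function names are hypothetical, the timestamp layout is inferred from the "Detected" and "Resolved" lines on this page, and the 24-month window is an assumption taken from "the last 2 years".

```python
from datetime import datetime

# Timestamp layout inferred from the "Detected" / "Resolved" lines.
# %b assumes English month abbreviations (the default C locale).
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def duration_minutes(detected: str, resolved: str) -> int:
    """Whole minutes between two timestamps in the page's format."""
    start = datetime.strptime(detected, STAMP_FORMAT)
    end = datetime.strptime(resolved, STAMP_FORMAT)
    return int((end - start).total_seconds() // 60)


def format_duration(minutes: int) -> str:
    """Render minutes in the page's 'Xh Ym' style (or 'Ym' under an hour)."""
    hours, rest = divmod(minutes, 60)
    return f"{hours}h {rest}m" if hours else f"{rest}m"


def incidents_per_month(incidents: int, months: int) -> float:
    """Average incident rate, rounded to one decimal place."""
    return round(incidents / months, 1)
```

For example, `format_duration(duration_minutes("Nov 03, 2025, 02:15 PM UTC", "Nov 03, 2025, 06:43 PM UTC"))` gives `4h 28m`, and `incidents_per_month(14, 24)` gives `0.6`. Because the timestamps here carry only minute precision, a computed value can differ slightly from a listed duration.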

Source: https://status.shelf.io

Minor November 3, 2025

Self-Service Portal Homepage Cache Issue on some US/EU Portals

Detected by Pingoru: Nov 03, 2025, 02:15 PM UTC
Resolved: Nov 03, 2025, 06:43 PM UTC
Duration: 4h 28m

Affected: Self-Service Portals

Timeline · 4 updates

investigating Nov 03, 2025, 04:47 PM UTC

Some customer Self-Service Portal homepages were showing other customers' homepage content. Other pages (search, article, product pages) remained functional. Workaround: avoid the homepage. Append a language code, or add a query character such as ?, to force a fresh render.
identified Nov 03, 2025, 05:07 PM UTC

Root cause narrowed to CDN cache contamination triggered by a configuration rollback. The contamination affected homepage edge cache only.
monitoring Nov 03, 2025, 05:10 PM UTC

Self-Service Portal homepages are now returning correct content after cache purges. We are monitoring edge cache consistency and user reports for recurrence.
resolved Nov 03, 2025, 06:43 PM UTC

Duration: ~2h55m. Cause: CDN cache contamination of homepage caches. Impact was limited to US and EU Self-Service Portal homepages. All systems normal.
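
The workaround in the first update (adding a query string so the homepage is rendered fresh) can be sketched as a small URL helper. This is an illustration only: the parameter name `_cb`, the function name, and the example host are hypothetical, and it assumes the CDN includes the query string in its cache key, which the incident updates do not confirm.

```python
import time
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url: str, param: str = "_cb", token: Optional[str] = None) -> str:
    """Return url with a throwaway query parameter appended.

    `param` is a hypothetical name; any parameter the portal ignores
    would serve the same purpose.
    """
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier cache-busting value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    # Default token: current Unix time in seconds, so each call differs.
    query.append((param, token if token is not None else str(int(time.time()))))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Calling `cache_busted("https://portal.example.com/", token="1")` returns `https://portal.example.com/?_cb=1`; existing parameters such as `?lang=en` are preserved ahead of the added one.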

Major October 20, 2025

US region: Ongoing upstream provider degradation causing intermittent errors across multiple services

Detected by Pingoru: Oct 20, 2025, 02:57 PM UTC
Resolved: Oct 21, 2025, 06:18 AM UTC
Duration: 15h 20m

Affected: Integrations, Authentication, Viewing Content & Search, Insights & Analytics, Emails & Notifications, Self-Service Portals, Making Changes to Content

Timeline · 11 updates

Critical October 20, 2025

US region: Widespread service disruption due to upstream cloud provider incident

Detected by Pingoru: Oct 20, 2025, 06:48 AM UTC
Resolved: Oct 20, 2025, 10:14 AM UTC
Duration: 3h 26m

Affected: Integrations, Authentication, Viewing Content & Search, Insights & Analytics, Emails & Notifications, Self-Service Portals, Making Changes to Content

Timeline · 5 updates

investigating Oct 20, 2025, 07:13 AM UTC

We are investigating reports of sign‑in failures in our US region. The first errors began at approximately 06:48 UTC. EU and CA regions remain healthy. We'll share more as we learn. Next update within 30 minutes.
identified Oct 20, 2025, 07:24 AM UTC

We have identified that the sign‑in failures correlate with a regional incident at our cloud infrastructure provider in their US‑East region. Customers served from our US region may see "incorrect password" messages or slow/failed requests. Our EU and CA regions remain healthy.
identified Oct 20, 2025, 08:28 AM UTC

The issue persists. Our cloud infrastructure provider continues to experience a region-wide incident in their US‑East region. We are actively working with the provider to reduce impact.
monitoring Oct 20, 2025, 09:29 AM UTC

US region services have returned to normal operation as of 09:22 UTC. This incident stemmed from a region‑wide issue at our cloud infrastructure provider and caused intermittent errors and latency across multiple US services. Impact window: 06:48-09:22 UTC (2 hours 34 minutes). No data was lost. EU and CA regions were not affected. Thank you for your patience.
resolved Oct 20, 2025, 10:14 AM UTC

This incident has been resolved.

Minor October 17, 2025

Search, Announcements, and Feedback API errors (US)

Detected by Pingoru: Oct 17, 2025, 03:22 PM UTC
Resolved: Oct 17, 2025, 04:03 PM UTC
Duration: 41m

Affected: Viewing Content & Search

Timeline · 4 updates

Critical September 12, 2025

Service Availability Incident: Elevated 5xx Errors Across Regions

Detected by Pingoru: Sep 12, 2025, 04:30 PM UTC
Resolved: Sep 12, 2025, 08:00 PM UTC
Duration: 3h 29m

Affected: Integrations, Authentication, Viewing Content & Search, Insights & Analytics, Self-Service Portals, Making Changes to Content

Timeline · 8 updates

identified Sep 12, 2025, 05:04 PM UTC

Starting at 16:30 UTC, an infrastructure issue caused elevated 5xx errors for application and API requests. We have identified the cause and are rolling out a fix. Impact affects a subset of customers across multiple regions.
identified Sep 12, 2025, 05:14 PM UTC

We are seeing recovery in the CA and EU regions, and a fix is being rolled out to the remaining regions. A subset of customers outside those regions may still be impacted.
identified Sep 12, 2025, 05:21 PM UTC

We are continuing to work on a fix for this issue.
identified Sep 12, 2025, 05:30 PM UTC

We are continuing to work on a fix for this issue.
identified Sep 12, 2025, 05:57 PM UTC

Search has recovered across regions. Feedback and CPW remain degraded for a subset of customers while the fix completes rollout. Next update within 30 minutes.
monitoring Sep 12, 2025, 06:05 PM UTC

We’ve deployed a fix and are seeing recovery for application and API requests across regions. If you encountered errors, please try again or refresh your browser.
monitoring Sep 12, 2025, 06:10 PM UTC

All services have recovered. Error rates and latency are back to normal across all regions. We’ll continue to monitor closely for the next hour. If you still encounter issues, please refresh your browser and try again.
resolved Sep 12, 2025, 08:00 PM UTC

This incident has been resolved.

Critical September 10, 2025

Investigating: SSO login failures across regions

Detected by Pingoru: Sep 10, 2025, 02:38 PM UTC
Resolved: Sep 10, 2025, 04:35 PM UTC
Duration: 1h 57m

Affected: Authentication

Timeline · 3 updates

investigating Sep 10, 2025, 04:06 PM UTC

Starting at 14:38 UTC, we observed a spike in SSO authentication failures. Affected users may be unable to sign in and see "access_denied (Unauthorized)" after returning from their IdP. We are actively investigating with our identity provider. Next update within 30 minutes.
identified Sep 10, 2025, 04:26 PM UTC

We have deployed a fix to our authentication flow at 16:15 UTC. Users should begin to see successful sign-ins resume across all regions. Please retry logging in; starting a new browser session (or incognito) may help clear a failed attempt. We’re closely monitoring error rates and will provide another update within 30 minutes.
resolved Sep 10, 2025, 04:35 PM UTC

SSO sign-ins have been restored. Between 14:38 and 16:27 UTC, a configuration regression in our authentication layer caused the authorization code exchange to fail, resulting in "access_denied (Unauthorized)" after IdP login. We corrected the configuration and refreshed services; metrics show recovery across all regions.

Major September 2, 2025

Search and Content Viewing Degradation

Detected by Pingoru: Sep 02, 2025, 06:26 PM UTC
Resolved: Sep 02, 2025, 06:56 PM UTC
Duration: 29m

Affected: Viewing Content & Search, Self-Service Portals

Timeline · 5 updates

investigating Sep 02, 2025, 07:16 PM UTC

We are investigating an issue where users could experience intermittent errors and delays when searching or viewing content; content creation and edits remain available.
identified Sep 02, 2025, 07:16 PM UTC

We have identified a performance issue and are working to restore normal search and content viewing.
monitoring Sep 02, 2025, 07:17 PM UTC

Mitigation is in place and performance is improving for search and content viewing; we will provide another update by 18:56 UTC.
resolved Sep 02, 2025, 07:18 PM UTC

The issue affecting search and content viewing has been resolved and service is operating normally.
postmortem Sep 02, 2025, 07:19 PM UTC

Between 18:26 and 18:56 UTC in the US region, approximately 20% of search and content viewing requests failed outright. The remaining successful requests experienced elevated latency, with p99 around 17 seconds. Content creation and edits were not affected.

Major August 22, 2025

Degraded performance in Content service

Detected by Pingoru: Aug 22, 2025, 09:27 AM UTC
Resolved: Aug 22, 2025, 06:19 PM UTC
Duration: 8h 51m

Affected: Making Changes to Content

Timeline · 8 updates

investigating Aug 22, 2025, 09:27 AM UTC

We're currently experiencing an issue with a third-party service provider that may impact some users' experience. Our team is actively working with the provider to resolve the situation as quickly as possible.
investigating Aug 22, 2025, 09:27 AM UTC

We are continuing to investigate this issue.
investigating Aug 22, 2025, 10:42 AM UTC

We have identified that the issue is limited to gem listing updates. Specifically, some gems may continue to appear in their previous location after being moved or archived. Our team is actively working on a fix to ensure updates are reflected correctly. We will share another update as soon as we have progress to report. Thank you for your understanding.
identified Aug 22, 2025, 12:48 PM UTC

The core issue is isolated, and we are working on the fix.
identified Aug 22, 2025, 03:10 PM UTC

Our team is actively working on a resolution. We will provide further updates as soon as more information becomes available. Thank you for your patience.
identified Aug 22, 2025, 03:43 PM UTC

We have begun applying fixes to address the issue. While progress has been made, the issue is not yet fully resolved. Our team continues to work toward a complete resolution, and we will share updates as they become available.
identified Aug 22, 2025, 03:50 PM UTC

Some functionality has been successfully restored. We are continuing to work on bringing full service back online and will provide updates as progress continues.
resolved Aug 22, 2025, 06:19 PM UTC

The issue has been fully resolved and all functionality is now restored. Thank you for your patience throughout this incident.

Major July 3, 2025

Search Functionality Disruption

Detected by Pingoru: Jul 03, 2025, 03:53 PM UTC
Resolved: Jun 27, 2025, 12:40 PM UTC
Duration: —

Timeline · 1 update

resolved Jul 03, 2025, 03:53 PM UTC

On 27 June 2025, between 14:40 UTC and 14:51 UTC, some users in the US region were unable to retrieve search results after selecting a library. Other areas of the platform continued to operate normally.

Critical July 3, 2025

Search Functionality Degradation Affecting Multiple Features

Detected by Pingoru: Jul 03, 2025, 03:26 PM UTC
Resolved: Jul 03, 2025, 03:46 PM UTC
Duration: 19m

Affected: Viewing Content & Search

Timeline · 2 updates

identified Jul 03, 2025, 03:48 PM UTC

The issue has been identified and a fix is being implemented.
resolved Jul 03, 2025, 03:51 PM UTC

This incident has been resolved.

Major April 29, 2025

Gem Content Inaccessible via KMS Dashboard/Search

Detected by Pingoru: Apr 29, 2025, 03:37 PM UTC
Resolved: Apr 29, 2025, 04:03 PM UTC
Duration: 26m

Affected: Viewing Content & Search

Timeline · 4 updates

identified Apr 29, 2025, 04:03 PM UTC

We are investigating reports of users being unable to view Gem content.
identified Apr 29, 2025, 04:04 PM UTC

We have identified the root cause related to a deployment configuration and are deploying a correction.
resolved Apr 29, 2025, 04:05 PM UTC

The configuration issue has been corrected, and Gem content access via KMS is restored.
postmortem Apr 29, 2025, 06:36 PM UTC

Between approximately 15:37 UTC and 16:03 UTC on April 29, 2025, some users experienced an issue where content pages ('Gems') would not load when accessed directly from the main Shelf dashboard or search results. This temporary issue stemmed from the premature activation of a configuration change during preparations for our planned user interface update, scheduled for April 30th. Impact & Workaround: During this time, content remained accessible via the Gem Preview feature. Other platform areas like Agent Assist, Self-Service Portals, and Content Intelligence were unaffected. Resolution: The issue was identified and fully resolved by 16:03 UTC. All systems are now operating normally. We apologize for any inconvenience caused.

Critical February 25, 2025

Content Intelligence Service Degradation in US Region

Detected by Pingoru: Feb 25, 2025, 05:20 PM UTC
Resolved: Feb 25, 2025, 08:56 PM UTC
Duration: 3h 36m

Affected: Insights & Analytics

Timeline · 3 updates

identified Feb 25, 2025, 07:53 PM UTC

We are currently experiencing a disruption to our Content Intelligence service in the US region that began at 17:20 UTC. Users may be unable to view analytics dashboards and charts. All data remains secure, and any changes made during this period will be properly processed once service is fully restored. Our engineering team has identified the root cause as an underlying infrastructure issue with our service provider and is actively implementing mitigation measures, including routing operations to redundant systems. We are in direct communication with AWS support engineers via a priority incident channel and collaboratively working toward resolution. This issue affects only the US region; EU and CA regions remain fully operational. We will provide updates as the situation progresses.
monitoring Feb 25, 2025, 08:12 PM UTC

The Content Intelligence service in the US region has been fully restored as of 20:00 UTC. Our engineering team successfully implemented a failover solution using a recent system backup taken moments before the disruption. All user data is intact and operations performed during the outage have been properly stored. Some users may experience brief periods of reduced performance as our systems complete the scaling process. We continue to monitor the service closely during this stabilization period and appreciate your patience throughout this event.
resolved Feb 25, 2025, 08:56 PM UTC

The disruption to Content Intelligence service in the US region has been fully resolved. Our engineering team successfully restored all services to their normal state. The situation has been closely monitored, and performance metrics have stabilized without further issues. We appreciate your patience and understanding during this time.

Minor September 19, 2024

Chrome Update Causing UI Errors

Detected by Pingoru: Sep 19, 2024, 08:08 AM UTC
Resolved: Sep 19, 2024, 12:30 PM UTC
Duration: 4h 22m

Affected: Viewing Content & Search

Timeline · 3 updates

investigating Sep 19, 2024, 08:08 AM UTC

We are aware of an issue where a recent update to Google Chrome is causing errors within the user interface of our platform. Our team is actively investigating the matter to identify the root cause and implement a fix. We will provide further updates as soon as more information is available. Thank you for your patience and understanding.
monitoring Sep 19, 2024, 12:28 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Sep 19, 2024, 12:30 PM UTC

This incident has been resolved.

Major June 18, 2024

Content Publication Workflow: Service Interruption

Detected by Pingoru: Jun 18, 2024, 11:30 AM UTC
Resolved: Jun 18, 2024, 04:59 PM UTC
Duration: 5h 29m

Affected: Making Changes to Content

Timeline · 3 updates

identified Jun 18, 2024, 04:57 PM UTC

We have identified an issue with the task publishing process and are actively working to resolve it.
monitoring Jun 18, 2024, 05:03 PM UTC

A fix has been implemented and we are monitoring the results.
resolved Jun 18, 2024, 05:04 PM UTC

This incident has been resolved.
