Shelf.io Outage History

Shelf.io is up right now

Shelf.io has had 14 outages since June 18, 2024, totaling 52h 57m of downtime, an average of 0.6 incidents per month over the last 2 years. Each is summarised below with incident details, duration, and resolution information.
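The summary figures can be sanity-checked with a little arithmetic. The sketch below assumes that "the last 2 years" means a 24-month window; the constant and helper names are illustrative and do not come from the status page.

```python
from datetime import timedelta

# Figures quoted in the summary above.
OUTAGE_COUNT = 14
TOTAL_DOWNTIME = timedelta(hours=52, minutes=57)

# Assumption: "the last 2 years" is treated as a 24-month window.
WINDOW_MONTHS = 24


def format_duration(delta):
    """Render a timedelta as 'Hh MMm', rounded to the nearest minute."""
    total_minutes = round(delta.total_seconds() / 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


incidents_per_month = OUTAGE_COUNT / WINDOW_MONTHS
mean_outage = TOTAL_DOWNTIME / OUTAGE_COUNT

print(round(incidents_per_month, 1))
print(format_duration(mean_outage))
```

Under that assumption, the rate rounds to the quoted 0.6 incidents per month, and the mean outage works out to a little under four hours.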

Source: https://status.shelf.io

Minor November 3, 2025

Self-Service Portal Homepage Cache Issue on some US/EU Portals

Detected by Pingoru
Nov 03, 2025, 02:15 PM UTC
Resolved
Nov 03, 2025, 06:43 PM UTC
Duration
4h 28m
Affected: Self-Service Portals
Timeline · 4 updates
  1. investigating Nov 03, 2025, 04:47 PM UTC

    Some customer Self-Service Portal homepages were showing other customers' homepage content. Other pages (search, article, product pages) remained functional. Workaround: avoid the homepage. Append a language code, or add a query character such as ?, to force a fresh render.

  2. identified Nov 03, 2025, 05:07 PM UTC

    Root cause narrowed to CDN cache contamination triggered by a configuration rollback. The contamination affected homepage edge cache only.

  3. monitoring Nov 03, 2025, 05:10 PM UTC

    Self-Service Portal homepages are now returning correct content after cache purges. We are monitoring edge cache consistency and user reports for recurrence.

  4. resolved Nov 03, 2025, 06:43 PM UTC

    Duration: ~2h55m. Cause: CDN cache contamination of homepage caches. Impact limited to US and EU Self-Service Portal homepages. All systems normal.
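The workaround in the first update (adding a query character to the homepage URL) can be sketched as a small helper. This is a minimal illustration and not Shelf's tooling: the `_fresh` parameter name and the `portal.example.com` host are hypothetical, and it assumes the CDN includes the query string in its cache key.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url, token=None):
    """Return url with a throwaway query parameter appended.

    Assumption: the edge cache keys on the full URL including the query
    string, so a new parameter value forces a fresh render.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # "_fresh" is a hypothetical parameter name chosen for illustration.
    value = token if token is not None else str(int(time.time()))
    query.append(("_fresh", value))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical portal URL, shown only to illustrate the call.
print(cache_busted("https://portal.example.com/", token="1"))
```

Existing query parameters are preserved, so a URL that already carries a language code keeps it.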


Critical October 20, 2025

US region: Widespread service disruption due to upstream cloud provider incident

Detected by Pingoru
Oct 20, 2025, 06:48 AM UTC
Resolved
Oct 20, 2025, 10:14 AM UTC
Duration
3h 26m
Affected: Integrations, Authentication, Viewing Content & Search, Insights & Analytics, Emails & Notifications, Self-Service Portals, Making Changes to Content
Timeline · 5 updates
  1. investigating Oct 20, 2025, 07:13 AM UTC

    We are investigating reports of sign‑in failures in our US region. The first errors began at approximately 06:48 UTC. EU and CA regions remain healthy. We'll share more as we learn. Next update within 30 minutes.

  2. identified Oct 20, 2025, 07:24 AM UTC

    We have identified that the sign‑in failures correlate with a regional incident at our cloud infrastructure provider in their US‑East region. Customers served from our US region may see "incorrect password" messages or slow/failed requests. Our EU and CA regions remain healthy.

  3. identified Oct 20, 2025, 08:28 AM UTC

    The issue persists. Our cloud infrastructure provider continues to experience a region-wide incident in their US‑East region. We are actively working with the provider to reduce impact.

  4. monitoring Oct 20, 2025, 09:29 AM UTC

    US region services have returned to normal operation as of 09:22 UTC. This incident stemmed from a region‑wide issue at our cloud infrastructure provider and caused intermittent errors and latency across multiple US services. Impact window: 06:48-09:22 UTC (2 hours 34 minutes). No data was lost. EU and CA regions were not affected. Thank you for your patience.

  5. resolved Oct 20, 2025, 10:14 AM UTC

    This incident has been resolved.
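The listed duration (3h 26m) and the impact window quoted in the monitoring update (2 hours 34 minutes) measure different intervals. A short sketch using only the timestamps above shows how each figure arises; the `utc` helper is illustrative.

```python
from datetime import datetime, timedelta, timezone


def utc(hhmm, day="2025-10-20"):
    """Build a timezone-aware UTC datetime for the incident date."""
    parsed = datetime.strptime(f"{day} {hhmm}", "%Y-%m-%d %H:%M")
    return parsed.replace(tzinfo=timezone.utc)


first_error = utc("06:48")  # first errors, per the investigating update
recovered = utc("09:22")    # services back to normal, per the monitoring update
resolved = utc("10:14")     # incident marked resolved

impact_window = recovered - first_error    # customer-facing impact
listed_duration = resolved - first_error   # detection to resolution

print(impact_window, listed_duration)
```

The gap between the two is the monitoring period after recovery and before the incident was closed.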


Critical September 12, 2025

Service Availability Incident: Elevated 5xx Errors Across Regions

Detected by Pingoru
Sep 12, 2025, 04:30 PM UTC
Resolved
Sep 12, 2025, 08:00 PM UTC
Duration
3h 29m
Affected: Integrations, Authentication, Viewing Content & Search, Insights & Analytics, Self-Service Portals, Making Changes to Content
Timeline · 8 updates
  1. identified Sep 12, 2025, 05:04 PM UTC

    Starting at 16:30 UTC, an infrastructure issue caused elevated 5xx errors for application and API requests. We have identified the cause and are rolling out a fix. Impact affects a subset of customers across multiple regions.

  2. identified Sep 12, 2025, 05:14 PM UTC

    We are seeing recovery in the CA and EU regions, and a fix is being rolled out to the remaining regions. A subset of customers outside those regions may still be impacted.

  3. identified Sep 12, 2025, 05:21 PM UTC

    We are continuing to work on a fix for this issue.

  4. identified Sep 12, 2025, 05:30 PM UTC

    We are continuing to work on a fix for this issue.

  5. identified Sep 12, 2025, 05:57 PM UTC

    Search has recovered across regions. Feedback and CPW remain degraded for a subset of customers while the fix completes rollout. Next update within 30 minutes.

  6. monitoring Sep 12, 2025, 06:05 PM UTC

    We’ve deployed a fix and are seeing recovery for application and API requests across regions. If you encountered errors, please try again or refresh your browser.

  7. monitoring Sep 12, 2025, 06:10 PM UTC

    All services have recovered. Error rates and latency are back to normal across all regions. We’ll continue to monitor closely for the next hour. If you still encounter issues, please refresh your browser and try again.

  8. resolved Sep 12, 2025, 08:00 PM UTC

    This incident has been resolved.


Critical September 10, 2025

Investigating: SSO login failures across regions

Detected by Pingoru
Sep 10, 2025, 02:38 PM UTC
Resolved
Sep 10, 2025, 04:35 PM UTC
Duration
1h 57m
Affected: Authentication
Timeline · 3 updates
  1. investigating Sep 10, 2025, 04:06 PM UTC

    Starting at 14:38 UTC, we observed a spike in SSO authentication failures. Affected users may be unable to sign in and see "access_denied (Unauthorized)" after returning from their IdP. We are actively investigating with our identity provider. Next update within 30 minutes.

  2. identified Sep 10, 2025, 04:26 PM UTC

    We deployed a fix to our authentication flow at 16:15 UTC. Users should begin to see successful sign-ins resume across all regions. Please retry logging in; starting a new browser session (or incognito) may help clear a failed attempt. We’re closely monitoring error rates and will provide another update within 30 minutes.

  3. resolved Sep 10, 2025, 04:35 PM UTC

    SSO sign-ins have been restored. Between 14:38 and 16:27 UTC, a configuration regression in our authentication layer caused the authorization code exchange to fail, resulting in "access_denied (Unauthorized)" after IdP login. We corrected the configuration and refreshed services; metrics show recovery across all regions.


Major September 2, 2025

Search and Content Viewing Degradation

Detected by Pingoru
Sep 02, 2025, 06:26 PM UTC
Resolved
Sep 02, 2025, 06:56 PM UTC
Duration
29m
Affected: Viewing Content & Search, Self-Service Portals
Timeline · 5 updates
  1. investigating Sep 02, 2025, 07:16 PM UTC

    We are investigating an issue where users could experience intermittent errors and delays when searching or viewing content; content creation and edits remain available.

  2. identified Sep 02, 2025, 07:16 PM UTC

    We have identified a performance issue and are working to restore normal search and content viewing.

  3. monitoring Sep 02, 2025, 07:17 PM UTC

    Mitigation is in place and performance is improving for search and content viewing; we will provide another update by 18:56 UTC.

  4. resolved Sep 02, 2025, 07:18 PM UTC

    The issue affecting search and content viewing has been resolved and service is operating normally.

  5. postmortem Sep 02, 2025, 07:19 PM UTC

    Between 18:26 and 18:56 UTC in the US region, approximately 20% of search and content viewing requests failed outright. The remaining successful requests experienced elevated latency, with p99 around 17 seconds. Content creation and edits were not affected.


Major August 22, 2025

Degraded performance in Content service

Detected by Pingoru
Aug 22, 2025, 09:27 AM UTC
Resolved
Aug 22, 2025, 06:19 PM UTC
Duration
8h 51m
Affected: Making Changes to Content
Timeline · 8 updates
  1. investigating Aug 22, 2025, 09:27 AM UTC

    We're currently experiencing an issue with a third-party service provider that may impact some users' experience. Our team is actively working with the provider to resolve the situation as quickly as possible.

  2. investigating Aug 22, 2025, 09:27 AM UTC

    We are continuing to investigate this issue.

  3. investigating Aug 22, 2025, 10:42 AM UTC

    We have identified that the issue is limited to gem listing updates. Specifically, some gems may continue to appear in their previous location after being moved or archived. Our team is actively working on a fix to ensure updates are reflected correctly. We will share another update as soon as we have progress to report. Thank you for your understanding.

  4. identified Aug 22, 2025, 12:48 PM UTC

    The core issue is isolated, and we are working on the fix.

  5. identified Aug 22, 2025, 03:10 PM UTC

    Our team is actively working on a resolution. We will provide further updates as soon as more information becomes available. Thank you for your patience.

  6. identified Aug 22, 2025, 03:43 PM UTC

    We have begun applying fixes to address the issue. While progress has been made, the issue is not yet fully resolved. Our team continues to work toward a complete resolution, and we will share updates as they become available.

  7. identified Aug 22, 2025, 03:50 PM UTC

    Some functionalities have been successfully restored. We are continuing to work on bringing full service back online and will provide updates as progress continues.

  8. resolved Aug 22, 2025, 06:19 PM UTC

    The issue has been fully resolved and all functionalities are now restored. Thank you for your patience throughout this incident.


Major July 3, 2025

Search Functionality Disruption

Detected by Pingoru
Jul 03, 2025, 03:53 PM UTC
Resolved
Jun 27, 2025, 12:40 PM UTC
Duration
Timeline · 1 update
  1. resolved Jul 03, 2025, 03:53 PM UTC

    On 27 June 2025, between 14:40 UTC and 14:51 UTC, some users in the US region were unable to retrieve search results after selecting a library. Other areas of the platform continued to operate normally.


Major April 29, 2025

Gem Content Inaccessible via KMS Dashboard/Search

Detected by Pingoru
Apr 29, 2025, 03:37 PM UTC
Resolved
Apr 29, 2025, 04:03 PM UTC
Duration
26m
Affected: Viewing Content & Search
Timeline · 4 updates
  1. identified Apr 29, 2025, 04:03 PM UTC

    We are investigating reports of users being unable to view Gem content.

  2. identified Apr 29, 2025, 04:04 PM UTC

    We have identified the root cause related to a deployment configuration and are deploying a correction.

  3. resolved Apr 29, 2025, 04:05 PM UTC

    The configuration issue has been corrected, and Gem content access via KMS is restored.

  4. postmortem Apr 29, 2025, 06:36 PM UTC

    Between approximately 15:37 UTC and 16:03 UTC on April 29, 2025, some users experienced an issue where content pages ('Gems') would not load when accessed directly from the main Shelf dashboard or search results. This temporary issue stemmed from the premature activation of a configuration change during preparations for our planned user interface update, scheduled for April 30th.

    Impact & Workaround: During this time, content remained accessible via the Gem Preview feature. Other platform areas like Agent Assist, Self-Service Portals, and Content Intelligence were unaffected.

    Resolution: The issue was identified and fully resolved by 16:03 UTC. All systems are now operating normally. We apologize for any inconvenience caused.


Critical February 25, 2025

Content Intelligence Service Degradation in US Region

Detected by Pingoru
Feb 25, 2025, 05:20 PM UTC
Resolved
Feb 25, 2025, 08:56 PM UTC
Duration
3h 36m
Affected: Insights & Analytics
Timeline · 3 updates
  1. identified Feb 25, 2025, 07:53 PM UTC

    We are currently experiencing a disruption to our Content Intelligence service in the US region that began at 17:20 UTC. Users may be unable to view analytics dashboards and charts. All data remains secure, and any changes made during this period will be properly processed once service is fully restored. Our engineering team has identified the root cause as an underlying infrastructure issue with our service provider and is actively implementing mitigation measures, including routing operations to redundant systems. We are in direct communication with AWS support engineers via a priority incident channel and collaboratively working toward resolution. This issue affects only the US region; EU and CA regions remain fully operational. We will provide updates as the situation progresses.

  2. monitoring Feb 25, 2025, 08:12 PM UTC

    The Content Intelligence service in the US region has been fully restored as of 20:00 UTC. Our engineering team successfully implemented a failover solution using a recent system backup taken moments before the disruption. All user data is intact and operations performed during the outage have been properly stored. Some users may experience brief periods of reduced performance as our systems complete the scaling process. We continue to monitor the service closely during this stabilization period and appreciate your patience throughout this event.

  3. resolved Feb 25, 2025, 08:56 PM UTC

    The disruption to Content Intelligence service in the US region has been fully resolved. Our engineering team successfully restored all services to their normal state. The situation has been closely monitored, and performance metrics have stabilized without further issues. We appreciate your patience and understanding during this time.


Minor September 19, 2024

Chrome Update Causing UI Errors

Detected by Pingoru
Sep 19, 2024, 08:08 AM UTC
Resolved
Sep 19, 2024, 12:30 PM UTC
Duration
4h 22m
Affected: Viewing Content & Search
Timeline · 3 updates
  1. investigating Sep 19, 2024, 08:08 AM UTC

    We are aware of an issue where a recent update to Google Chrome is causing errors within the user interface of our platform. Our team is actively investigating the matter to identify the root cause and implement a fix. We will provide further updates as soon as more information is available. Thank you for your patience and understanding.

  2. monitoring Sep 19, 2024, 12:28 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Sep 19, 2024, 12:30 PM UTC

    This incident has been resolved.


Major June 18, 2024

Content Publication Workflow: Service Interruption

Detected by Pingoru
Jun 18, 2024, 11:30 AM UTC
Resolved
Jun 18, 2024, 04:59 PM UTC
Duration
5h 29m
Affected: Making Changes to Content
Timeline · 3 updates
  1. identified Jun 18, 2024, 04:57 PM UTC

    We have identified an issue with the task publishing process and are actively working to resolve it.

  2. monitoring Jun 18, 2024, 05:03 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Jun 18, 2024, 05:04 PM UTC

    This incident has been resolved.
