Rebuy incident
Intermittent Sync Issues Affecting Reporting, Widgets, and Smart Cart
Rebuy experienced a major incident on April 1, 2025 affecting the API, Smart Cart, and one other component, lasting 5h 14m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Apr 01, 2025, 02:05 AM UTC
We’re currently investigating multiple reports of sync issues that are intermittently affecting reporting features, on-site widgets, and Smart Cart. This issue appears to be sporadic and does not impact all clients. Our engineering team is actively working to identify and resolve the problem. We’ll provide updates as we learn more.
- investigating Apr 01, 2025, 02:18 AM UTC
We are continuing to investigate this issue.
- investigating Apr 01, 2025, 02:33 AM UTC
A change is being implemented to help alleviate the issue.
- investigating Apr 01, 2025, 02:51 AM UTC
We're beginning to see some errors resolving and signs of improvement, but the issue is not yet fully resolved. Our team continues to actively troubleshoot and assess next steps to fully address the underlying problem.
- investigating Apr 01, 2025, 03:22 AM UTC
We are continuing to see improvements and are implementing an additional change to further alleviate the issue while we continue to investigate the root cause.
- investigating Apr 01, 2025, 04:01 AM UTC
Additional improvements are being observed; however, sporadic issues persist across some stores, primarily related to product synchronization. Our team is continuing to investigate the situation.
- investigating Apr 01, 2025, 04:57 AM UTC
Product images in the widgets are still behaving inconsistently, but carts and checkout continue to work properly. Our team is working on identifying the root cause.
- identified Apr 01, 2025, 05:50 AM UTC
The issue has been identified, and we are actively deploying a fix.
- monitoring Apr 01, 2025, 07:14 AM UTC
A fix has been implemented and we are monitoring the results.
- resolved Apr 01, 2025, 07:20 AM UTC
The issue has been resolved. Our team is preparing a formal RCA and will post it once complete.
- postmortem Apr 04, 2025, 10:24 PM UTC
**Issue:** Customers experienced degraded service due to issues with product images, links, and webhooks, impacting both the admin panel and certain on-site features such as widgets and Smart Cart. The problem was identified after a scheduled webhook health check ran and led to processing issues.

**Root Cause:** The issue stemmed from an update to our APIs to align with Shopify's required version, which included changes to webhook handling and product/variant data formats. After the update, malformed data started to appear during the webhook health check. This increased the processing load on Bigtable, leading to service degradation across other systems. The malformed data also caused errors in the Smart Cart service and issues with product images and links in the admin panel and widgets. The data corruption was traced to conflicts between new and old webhooks attempting to process similar items at the same time.

**Actions Taken:**

* **Bigtable Load Reduction:** Immediate actions were taken to reduce the load on Bigtable, including manual resource updates, which helped stabilize the system temporarily.
* **Smart Cart Hotfix:** The Smart Cart team released hotfixes to handle the malformed data, resolving the errors in the Smart Cart service.
* **Workaround for Inventory Webhooks:** As a temporary solution, webhooks related to inventory levels were ignored to prevent further processing of malformed data.
* **Data Resync:** We began resyncing all affected products to remove the malformed data and prevent further issues.

**Next Steps:** We are currently working on fully resolving the webhook issue related to product inventory levels. This will allow us to roll back the inventory webhook workaround currently in place and help prevent similar issues in the future.

**Update:** All tasks related to the postmortem have been completed and confirmed successful.
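
The inventory webhook workaround and the malformed-data handling described in the postmortem could be sketched roughly as follows. This is an illustration under stated assumptions, not Rebuy's actual code: the function `triage_webhook`, the `Decision` type, the required-field list, and the decision labels are all hypothetical. The topic prefix `inventory_levels/` follows Shopify's naming for inventory-level webhook topics.

```python
from dataclasses import dataclass
from typing import Any

# Assumption: topics skipped while the temporary workaround is in place.
IGNORED_TOPIC_PREFIXES = ("inventory_levels/",)

# Assumption: fields a product payload needs before it is safe to process.
REQUIRED_PRODUCT_FIELDS = ("id", "title", "variants")


@dataclass
class Decision:
    action: str  # one of "ignore", "reject", "process"
    reason: str


def triage_webhook(topic: str, payload: Any) -> Decision:
    """Decide what to do with an incoming webhook before any write happens."""
    # Workaround: drop inventory-level webhooks entirely.
    if topic.startswith(IGNORED_TOPIC_PREFIXES):
        return Decision("ignore", "inventory webhooks are temporarily skipped")

    # Guard against malformed data instead of passing it downstream.
    if not isinstance(payload, dict):
        return Decision("reject", "payload is not an object")
    missing = [f for f in REQUIRED_PRODUCT_FIELDS if f not in payload]
    if missing:
        return Decision("reject", "missing fields: " + ", ".join(missing))
    if not isinstance(payload["variants"], list):
        return Decision("reject", "variants is not a list")

    return Decision("process", "ok")
```

Validating before writing is one way to keep malformed records from reaching storage and downstream services. It does not address the concurrent processing conflict between old and new webhooks, which would need a separate fix.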