imgix incident

Missing Images in Asset Manager

Minor · Resolved

imgix experienced a minor incident on March 17, 2025 affecting Web Administration Tools, lasting 1d 13h. The incident has been resolved; the full update timeline is below.

Started
Mar 17, 2025, 05:48 AM UTC
Resolved
Mar 18, 2025, 06:56 PM UTC
Duration
1d 13h
Detected by Pingoru
Mar 17, 2025, 05:48 AM UTC

Affected components

Web Administration Tools

Update timeline

  1. identified Mar 17, 2025, 05:48 AM UTC

    We have identified an issue affecting the imgix Asset Manager. Users may experience difficulties searching for previously indexed assets and delays in indexing new assets. A fix is currently in progress. The rendering service remains unaffected.

  2. resolved Mar 18, 2025, 06:56 PM UTC

    The root cause of the incident has been resolved. Some assets may still return an error. Please get in touch with [email protected] if you are experiencing issues.

  3. postmortem Mar 31, 2025, 05:12 PM UTC

    # What happened

    Between **March 17th, 5:48 AM UTC** and **March 18th, 6:56 PM UTC**, the Asset Manager experienced an indexing outage. During this period, newly uploaded assets and a very small percentage of pre-existing assets did not appear in the Asset Manager UI. These assets were still successfully uploaded to the Origin and remained accessible via the Rendering API.

    # How it happened

    Unannounced maintenance by our upstream provider caused a node failure in our Asset Manager infrastructure, resulting in the temporary loss of indexed asset data.

    # What went wrong

    Several things went wrong during this incident:

    * Our service provider failed to notify us of the maintenance window
    * Our Asset Manager infrastructure was not adequately provisioned to tolerate failures during the maintenance
    * A large amount of data had to be restored, which severely prolonged the restoration process

    # What we are doing to address this incident

    * We have migrated and optimized infrastructure configurations to better tolerate node failures
    * We created additional backups and introduced additional indexing layers to improve redundancy and resilience
    * We are evaluating alternative upstream providers to reduce dependency risks