Flatfile incident

Uploads Are Unresponsive

Major Resolved

Flatfile experienced a major incident on June 11, 2025 affecting Spaces, lasting 5h 16m. The incident has been resolved; the full update timeline is below.

Started
Jun 11, 2025, 05:00 PM UTC
Resolved
Jun 11, 2025, 10:17 PM UTC
Duration
5h 16m
Detected by Pingoru
Jun 11, 2025, 05:00 PM UTC

Affected components

Spaces

Update timeline

  1. investigating Jun 11, 2025, 05:00 PM UTC

    We are currently investigating this issue.

  2. identified Jun 11, 2025, 05:24 PM UTC

    The issue has been identified and a fix is being implemented.

  3. identified Jun 11, 2025, 05:42 PM UTC

    We are continuing to work on a fix for this issue.

  4. monitoring Jun 11, 2025, 09:20 PM UTC

    A fix has been implemented and we are monitoring the results.

  5. resolved Jun 11, 2025, 10:17 PM UTC

    This incident has been resolved.

  6. postmortem Jun 12, 2025, 02:36 PM UTC

    # Incident Overview

    A series of compounding events on the US region Platform led to network throttling of a network file system, which severely degraded file extractions on the Platform.

    ## Details of the Incident

    At approximately 10:00 PDT on June 11, Flatfile experienced several compounding events that resulted in network bandwidth on our network file systems being overwhelmed and throttled by our cloud service provider. A spike in usage on the Platform triggered autoscaling events, which dramatically increased read operations against the network file stores and eventually exhausted their throughput. This severely degraded file extraction performance on the Platform for all customers. The initial response to the incident involved increasing capacity, which compounded the problem. At approximately 14:20 PDT, the engineering team provisioned a large increase in throughput, which ended the throttling and allowed file extraction jobs to proceed as expected.

    ## Impact Assessment

    The incident severely degraded all file extractions and led to increased latency on jobs as the queue became backlogged.

    ## Root Cause

    The throttling of throughput on the network file system was the cause of the incident; this was exacerbated by automated scaling operations that attempted to increase queue capacity.

    ## Resolution

    Flatfile engineering provisioned increased throughput on the file system to resolve the incident at 14:20 PDT.

    ## Security and Data Integrity

    Please be assured that this incident did not compromise the security or integrity of your data. Our commitment to data protection remains a top priority.
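
The scaling effect described in the postmortem can be illustrated with a deliberately simplified model. This is only a sketch and does not describe Flatfile's actual architecture. It assumes a shared file system with a fixed throughput cap that is split evenly among workers, and every name and number in it (`per_worker_throughput`, `extraction_seconds`, the 1000 MB/s cap, the 50 MB/s per-worker demand, the 500 MB file) is hypothetical.

```python
def per_worker_throughput(workers: int, demand_mb_s: float, cap_mb_s: float) -> float:
    """Read bandwidth (MB/s) each worker gets from a shared, capped file system.

    Assumption: the cap is split evenly, and a worker never reads faster
    than its own demand.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    return min(demand_mb_s, cap_mb_s / workers)


def extraction_seconds(file_mb: float, workers: int,
                       demand_mb_s: float, cap_mb_s: float) -> float:
    """Time for one worker to read a file of file_mb megabytes."""
    return file_mb / per_worker_throughput(workers, demand_mb_s, cap_mb_s)


if __name__ == "__main__":
    CAP = 1000.0     # hypothetical file-system throughput cap, MB/s
    DEMAND = 50.0    # hypothetical per-worker read rate, MB/s
    FILE_MB = 500.0  # hypothetical file size, MB

    # Below the cap, extra workers cost nothing; past it, each one slows the rest.
    for n in (10, 20, 40, 80):
        secs = extraction_seconds(FILE_MB, n, DEMAND, CAP)
        print(f"{n:>3} workers: {secs:.1f} s per extraction")
```

In this model, aggregate read throughput stops growing once the cap is reached, so adding workers past that point only lengthens each extraction. Raising the cap itself, which is the kind of change the Resolution section describes, is what restores per-worker bandwidth.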