Treasure Data incident
[US region] Ingest API - Performance Degradation
Treasure Data experienced a minor incident on April 21, 2025 affecting the Streaming Import REST API, lasting 1d 19h. The incident has been resolved; the full update timeline is below.
Affected components
- Streaming Import REST API
Update timeline
- investigating Apr 21, 2025, 04:10 PM UTC
Our Ingest API is experiencing a performance issue. We are investigating the cause.
- investigating Apr 21, 2025, 04:25 PM UTC
We are continuing to investigate this issue.
- identified Apr 21, 2025, 04:40 PM UTC
We have applied a fix and are observing its effect. We have identified a window between Thursday 3 AM UTC and Monday 11 AM UTC during which data was not being processed. We have implemented a remediation, and new incoming data is not impacted. We are actively working to resume sending the affected data, which will arrive out of order alongside new incoming data.
- identified Apr 21, 2025, 05:43 PM UTC
We are now attempting to restore the pipeline from a timestamp rather than from a checkpoint. There is no risk of user data loss. Downstream systems may observe duplicate events from the affected time window, but this will be limited in scope.
- monitoring Apr 21, 2025, 06:59 PM UTC
We have rolled out a fix and are processing the backlog of historical data at a safe, controlled rate. Full recovery of all data is estimated to take approximately 32 hours. We will continue monitoring and will provide updates throughout this process.
- monitoring Apr 22, 2025, 10:15 AM UTC
We have been monitoring the situation for the past 6 hours. At this time, we are unable to provide an updated estimate for full recovery. We are continuing to monitor closely and are also exploring additional measures to improve recovery performance. We will share another update in approximately 6 hours or sooner if there is any significant development. We apologize for the inconvenience.
- monitoring Apr 22, 2025, 04:39 PM UTC
Recovery is proceeding and we are continuing to closely monitor the progress. We will have another update in six hours.
- monitoring Apr 23, 2025, 12:44 AM UTC
We are continuing to run our recovery operation, and it is progressing. We are monitoring closely and will update in six hours.
- resolved Apr 23, 2025, 11:30 AM UTC
We have continued our recovery efforts. While the issue has not been fully resolved, its impact is now limited. We are contacting affected customers directly and will continue to provide them with updates as needed. Because this issue now affects only a limited number of customers, we are closing this status page. We sincerely apologize for the inconvenience this has caused and appreciate your patience and understanding as we continue to support those still impacted.
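Note for downstream consumers: the updates above mention that replayed data may arrive out of order and that some events from the affected window may be duplicated. The following is a minimal Python sketch of one way a consumer could handle both, not a description of Treasure Data's own processing. It assumes each record carries a unique identifier and an event timestamp; the field names `event_id` and `event_time` are hypothetical and are not taken from the Streaming Import REST API.

```python
from datetime import datetime, timezone
from typing import Iterable


def dedupe_and_order(events: Iterable[dict]) -> list[dict]:
    """Drop repeated event IDs (first occurrence wins), then sort by event time.

    Assumes every event is a dict with a hypothetical unique "event_id"
    and a timezone-aware "event_time" datetime.
    """
    seen: set[str] = set()
    unique: list[dict] = []
    for event in events:
        if event["event_id"] in seen:
            continue  # duplicate delivered during replay; skip it
        seen.add(event["event_id"])
        unique.append(event)
    # Replayed data arrives interleaved with new data, so restore event-time order.
    return sorted(unique, key=lambda e: e["event_time"])


# Example: a new event arrives before two copies of a replayed older event.
events = [
    {"event_id": "b", "event_time": datetime(2025, 4, 21, 12, 0, tzinfo=timezone.utc)},
    {"event_id": "a", "event_time": datetime(2025, 4, 18, 3, 0, tzinfo=timezone.utc)},
    {"event_id": "a", "event_time": datetime(2025, 4, 18, 3, 0, tzinfo=timezone.utc)},
]
result = dedupe_and_order(events)
```

This sketch holds all seen IDs in memory, which is only reasonable for a bounded batch such as the affected window. A long-running consumer would more likely rely on an idempotent write keyed on the event ID.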