- Detected by Pingoru
- Aug 18, 2025, 09:03 PM UTC
- Resolved
- Aug 18, 2025, 09:49 PM UTC
- Duration
- 45m
Affected: Insights
Timeline · 2 updates
-
investigating Aug 18, 2025, 09:03 PM UTC
Treasure Data Insights is currently unavailable; we are investigating this issue.
-
resolved Aug 18, 2025, 09:49 PM UTC
The issue was resolved; it was linked to a vendor issue. The vendor has resolved the problem that was impacting TD Insights.
Read the full incident report →
- Detected by Pingoru
- Jul 02, 2025, 02:53 AM UTC
- Resolved
- Jul 02, 2025, 08:03 AM UTC
- Duration
- 5h 10m
Affected: Insights
Timeline · 3 updates
-
investigating Jul 02, 2025, 02:53 AM UTC
We are currently investigating a slowdown affecting Treasure Insights on the AP02 site. As a result, you may experience longer than usual processing times, and some requests may time out or fail.
-
monitoring Jul 02, 2025, 07:17 AM UTC
We identified and fixed the cause of the performance degradation, and we confirmed that our metrics are back to normal. We will continue to monitor the service.
-
resolved Jul 02, 2025, 08:03 AM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Jun 23, 2025, 06:45 AM UTC
- Resolved
- Jun 23, 2025, 07:58 AM UTC
- Duration
- 1h 13m
Affected: CDP API
Timeline · 2 updates
-
identified Jun 23, 2025, 06:45 AM UTC
We are currently experiencing an issue with the CDP API concerning the retrieval of realtime attributes. When opening the settings screen or attempting to change settings, realtime attribute values cannot be retrieved, resulting in an error. This issue exclusively affects the realtime segment settings screen and does not impact the actual behavior of realtime segments. Our team is actively working to resolve this issue. We sincerely apologize for any inconvenience or disruption this may cause.
-
resolved Jun 23, 2025, 07:58 AM UTC
We are writing to inform you that the issue with realtime attribute retrieval in the CDP API occurred on June 23, 2025, at 04:46 UTC and was resolved on June 23, 2025, at 07:55 UTC. We sincerely apologize for any inconvenience this may have caused.
Read the full incident report →
- Detected by Pingoru
- Jun 01, 2025, 05:18 PM UTC
- Resolved
- Jun 02, 2025, 05:43 PM UTC
- Duration
- 1d
Affected: Streaming Import REST API
Timeline · 8 updates
-
investigating Jun 01, 2025, 04:17 PM UTC
Our Ingest API is experiencing a performance issue. We are investigating the cause.
-
investigating Jun 01, 2025, 04:35 PM UTC
Current Status: A fix has been applied and is currently under observation.
Impact: An issue was identified with data ingestion (delayed ingestion) between 8:00 AM and 8:30 AM PST on Sunday.
Remediation: A fix has been implemented, and new incoming data is now processing normally.
Next Steps: We are actively working to resume sending the affected data, which will arrive out of order along with new incoming data.
Updates to Follow: Further details will be provided as the situation progresses.
-
investigating Jun 01, 2025, 05:18 PM UTC
Latest Update: Our team has deployed a fix, and the system is now processing the backlog of data at a controlled rate.
Estimated Time to Full Recovery: ~7 hours.
We’re actively monitoring the recovery and will provide further updates as progress continues. Thank you for your patience.
-
investigating Jun 02, 2025, 01:49 AM UTC
Latest Update: Our team is still actively monitoring and assessing the processing of the data backlog. Thank you for your continued patience.
-
monitoring Jun 02, 2025, 04:47 AM UTC
Latest Update: Our internal graphs show that the overwhelming majority of the data has been processed. There is a small amount of residual data that is taking longer than expected to process. We will leave this status page in a Monitoring state until we are certain everything has been processed. Thank you for your continued patience.
-
monitoring Jun 02, 2025, 04:26 PM UTC
Latest Update: Our team is still actively monitoring and assessing the processing of the data. We will leave this status page in a Monitoring state until we are certain everything has been processed. Thank you for your continued patience.
-
monitoring Jun 02, 2025, 05:37 PM UTC
We are continuing to monitor for any further issues.
-
resolved Jun 02, 2025, 05:43 PM UTC
This incident has been resolved.
Duration: ~35 minutes of processing delay. Data ingestion was delayed between 7:35 AM and 10:35 AM PST on June 1.
Affected Customers: All customers ingesting data to the AWS US region during the incident window.
Impact: Delayed data availability in Plazma (up to 40 mins). No data loss occurred.
Read the full incident report →
- Detected by Pingoru
- Apr 15, 2025, 07:55 AM UTC
- Resolved
- Apr 15, 2025, 10:27 AM UTC
- Duration
- 2h 32m
Affected: Presto Query Engine
Timeline · 4 updates
-
investigating Apr 15, 2025, 07:55 AM UTC
Our Trino service is experiencing an issue. We are investigating the cause.
-
identified Apr 15, 2025, 08:09 AM UTC
The root cause has been identified and we are applying the fix.
-
monitoring Apr 15, 2025, 08:48 AM UTC
We are observing recovery. We continue to monitor for full recovery.
-
resolved Apr 15, 2025, 10:27 AM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Apr 09, 2025, 01:24 AM UTC
- Resolved
- Apr 09, 2025, 03:22 AM UTC
- Duration
- 1h 58m
Affected: Data Connector Integrations, Hadoop / Hive Query Engine
Timeline · 3 updates
-
investigating Apr 09, 2025, 01:24 AM UTC
We are currently investigating an issue where Hive Jobs and the Result Export jobs triggered from Hive Jobs are not functioning properly. Our team is actively looking into the cause of the issue. We will provide updates as soon as more information becomes available.
-
monitoring Apr 09, 2025, 01:49 AM UTC
We have identified the root cause and applied a hotfix. The issue was that Hive Jobs were unable to start properly, which in turn caused some Result Export jobs triggered from those Hive Jobs to fail. We are currently monitoring the system to ensure that the situation continues to improve.
-
resolved Apr 09, 2025, 03:22 AM UTC
We have confirmed that Hive jobs are now working properly, and the related Result Export jobs are functioning as expected. The impact occurred between April 8, 16:00 UTC and April 9, 01:35 UTC. If you had any jobs that failed during the affected time window, please re-run them as needed. We sincerely apologize for the inconvenience this may have caused.
Read the full incident report →
- Detected by Pingoru
- Feb 11, 2025, 05:48 PM UTC
- Resolved
- Feb 11, 2025, 09:47 PM UTC
- Duration
- 3h 58m
Timeline · 4 updates
-
investigating Feb 11, 2025, 05:48 PM UTC
Our utilization dashboards do not update with up-to-date information. Customers accessing their Treasure Data usage dashboards will see a gap in usage details from early Saturday morning UTC. There is no impact on ongoing Treasure Data usage, and all usage information is correctly stored internally. However, the dashboard where customers can view their consumption is not up-to-date. Note this is a reporting problem only. There is no indication of any issues with regular Treasure Data usage. We are working to diagnose the issue and will provide an update in the next hour.
-
identified Feb 11, 2025, 07:05 PM UTC
We have identified the cause of this issue and are working to restore service. Once the service is restored, we expect it will take a few hours to catch up on usage data for the last few days.
-
monitoring Feb 11, 2025, 08:34 PM UTC
We have remediated the issue and are processing usage data from the last 4 days. Users should see the usage dashboards catching up. We expect this process to take about another hour to complete.
-
resolved Feb 11, 2025, 09:47 PM UTC
This issue is resolved and our utilization dashboards should be showing up-to-date information. If you observe anything unusual in your usage data, please contact our support team. Thank you for your patience while working through this issue.
Read the full incident report →
- Detected by Pingoru
- Jan 30, 2025, 10:54 AM UTC
- Resolved
- Jan 30, 2025, 03:43 PM UTC
- Duration
- 4h 49m
Affected: CDP API, CDP Personalization - Lookup API, CDP Personalization - Ingest API
Timeline · 5 updates
-
investigating Jan 30, 2025, 10:54 AM UTC
We detected degraded performance of the Personalization API and an increase in its error rate. We are currently investigating this issue.
-
monitoring Jan 30, 2025, 11:38 AM UTC
We are currently observing that the performance degradation and error rate have improved. We continue to closely monitor the metrics.
-
monitoring Jan 30, 2025, 12:31 PM UTC
We are continuing to monitor for any further issues.
-
monitoring Jan 30, 2025, 02:18 PM UTC
We are still monitoring the service. On Thursday, 30 Jan 2025, between 10:00 UTC and 11:05 UTC, customers experienced elevated error rates and longer latency for Profiles API lookups. The cluster workload has since calmed down, and the cluster is operating normally. Our response team is ready to provision additional processing capacity, and we are closely monitoring the service status to avoid further downtime during peak times. In addition, we are working on isolating problematic access from the service. We will keep the status page open and update you on the progress.
-
resolved Jan 30, 2025, 03:43 PM UTC
We implemented fundamental isolation of a problematic configuration at 14:42 UTC. The remediation caused the cluster workload to drop from 60% to 1%. On Friday, we implemented write access isolation for the problematic configuration, which stopped the cluster workload from growing. Today, we implemented read access isolation, which restored the cluster workload to its previous level. The system is now operating normally, and we are closing the incident. We acknowledge that further action is needed to prevent a similar configuration from causing the same incident again. We will post a further postmortem when it is ready.
Read the full incident report →
- Detected by Pingoru
- Jan 23, 2025, 07:34 AM UTC
- Resolved
- Jan 23, 2025, 12:18 PM UTC
- Duration
- 4h 44m
Affected: CDP Personalization - Lookup API, CDP Personalization - Ingest API
Timeline · 6 updates
-
investigating Jan 23, 2025, 07:34 AM UTC
We are currently observing errors or performance degradation for the personalization API. We are investigating the cause of the issue now.
-
identified Jan 23, 2025, 08:05 AM UTC
The response team confirmed the symptom is from the same cause as the previous incidents. We are provisioning additional concurrency capacity to the environment. We will update you when it is completed.
-
identified Jan 23, 2025, 10:37 AM UTC
We provisioned additional capacity at 10:00 AM UTC to support the increasing workload. It improved the latency, but we still observed errors and long latency for a small number of requests. The response team has started provisioning further concurrency capacity. Unlike the previous method, the new process should not take long to provision. We will report the result in 30 minutes.
-
identified Jan 23, 2025, 11:08 AM UTC
We successfully provisioned 2x capacity in 30 minutes. New resources improved the latency, but the error rate is still high. The response team is planning to implement another remediation instead of adding resources. We will update you in 30 minutes.
-
monitoring Jan 23, 2025, 12:16 PM UTC
The response team found problematic real-time segment configurations in one customer's Parent Segment that possibly contributed to consuming the concurrency capacity. The team updated the real-time event routing configuration to mitigate the high latency issue. Combined with the capacity additions, this stabilized the Profiles API cache cluster. If you experience any delays or abnormal errors, please reach out to our support team. Thank you for your patience and understanding during this incident. We will update the postmortem with a further remediation plan as promised.
-
resolved Jan 23, 2025, 12:18 PM UTC
On Thursday, 23 Jan 2025, between 07:20 UTC and 11:40 UTC, customers experienced elevated error rates and increased latency related to the Profiles API. A fix has been implemented, and the issue has been resolved. If you experience any delays or abnormal errors, please reach out to our support team. Thank you for your patience and understanding during this incident. We will share an incident retrospective soon.
Read the full incident report →
Critical January 20, 2025 - Detected by Pingoru
- Jan 20, 2025, 01:34 PM UTC
- Resolved
- Jan 20, 2025, 04:04 PM UTC
- Duration
- 2h 30m
Affected: Insights
Timeline · 6 updates
-
investigating Jan 20, 2025, 01:34 PM UTC
We have observed that users are unable to access Treasure Insights. We are currently investigating the issue.
-
investigating Jan 20, 2025, 02:17 PM UTC
We are still investigating the issue.
-
investigating Jan 20, 2025, 02:49 PM UTC
The situation remains the same as it was in the last update.
-
identified Jan 20, 2025, 02:58 PM UTC
The problem has been identified, and we are currently working on a solution.
-
monitoring Jan 20, 2025, 03:44 PM UTC
A fix has been implemented, and we are monitoring the service to ensure everything is functioning correctly.
-
resolved Jan 20, 2025, 04:04 PM UTC
We would like to inform you that the issue has been fully resolved.
Incident Impact Details:
- Treasure Insights was returning 502 errors and was unreachable during the incident.
Incident Impact Time:
- Start: January 20, 09:51 UTC
- End: January 20, 15:35 UTC
We apologize for any inconvenience this may have caused and thank you for your patience and understanding.
Read the full incident report →
- Detected by Pingoru
- Jan 20, 2025, 10:33 AM UTC
- Resolved
- Jan 20, 2025, 11:41 AM UTC
- Duration
- 1h 8m
Affected: CDP API, CDP Personalization - Lookup API, CDP Personalization - Ingest API
Timeline · 4 updates
-
investigating Jan 20, 2025, 10:33 AM UTC
We are currently investigating this issue.
-
monitoring Jan 20, 2025, 10:35 AM UTC
Through our investigation, we identified the cause of the issue and have applied some remediation. Our team is closely monitoring the system to ensure continued stability.
-
monitoring Jan 20, 2025, 11:10 AM UTC
We are observing fewer errors now. However, we are still monitoring and re-evaluating the remedial steps to confirm better performance.
-
resolved Jan 20, 2025, 11:41 AM UTC
We would like to inform you that the issue has been fully resolved.
Incident Impact Details:
- The Personalization API experienced an outage, leading to increased errors and timeouts.
Incident Impact Time:
- Start: January 20, 07:45 UTC
- End: January 20, 11:15 UTC
We apologize for any inconvenience this may have caused and thank you for your patience and understanding.
Read the full incident report →
- Detected by Pingoru
- Dec 08, 2024, 07:48 PM UTC
- Resolved
- Dec 08, 2024, 08:27 PM UTC
- Duration
- 39m
Affected: Presto Query Engine
Timeline · 3 updates
-
investigating Dec 08, 2024, 07:48 PM UTC
Some users may be experiencing degraded performance when running Presto or Trino jobs. We are investigating the incident. At present, all users in the EU central region may be affected.
-
monitoring Dec 08, 2024, 08:05 PM UTC
A fix has been implemented and we are monitoring the results.
-
resolved Dec 08, 2024, 08:27 PM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Nov 13, 2024, 07:09 AM UTC
- Resolved
- Nov 13, 2024, 10:16 AM UTC
- Duration
- 3h 7m
Affected: CDP Personalization - Lookup API, CDP Personalization - Ingest API
Timeline · 3 updates
-
investigating Nov 13, 2024, 07:09 AM UTC
Since approximately 4:00 UTC, we have been experiencing an issue with requests to CDP KVS, which may be affecting Profiles API functionality, causing delays in KVS data synchronization and updates to real-time segment information. Our team is actively investigating and working to resolve the issue as quickly as possible. Please note that Realtime 2.0 is not affected.
-
monitoring Nov 13, 2024, 09:17 AM UTC
Through our investigation, we identified the cause of the issue as a recent release operation. We have reverted all changes from this release, and normal functionality has been restored. Our team is closely monitoring the system to ensure continued stability.
-
resolved Nov 13, 2024, 10:16 AM UTC
We would like to inform you that the issue has been fully resolved.
Incident Impact Details:
- Profiles API experienced an increased frequency of errors and timeouts.
- The latest logs were not reflected in real-time segments.
Incident Impact Time by Region:
- us: Start: November 13, 04:14 UTC; End: November 13, 08:55 UTC
- aws-tokyo: Start: November 13, 04:14 UTC; End: November 13, 08:54 UTC
- eu01: Start: November 13, 04:17 UTC; End: November 13, 08:51 UTC
- ap02: Start: November 13, 04:15 UTC; End: November 13, 09:01 UTC
- ap03: Start: November 13, 04:17 UTC; End: November 13, 08:52 UTC
We apologize for any inconvenience this may have caused and thank you for your patience and understanding.
Read the full incident report →
- Detected by Pingoru
- Nov 05, 2024, 06:24 PM UTC
- Resolved
- Nov 05, 2024, 07:26 PM UTC
- Duration
- 1h 1m
Affected: Presto Query Engine
Timeline · 4 updates
-
investigating Nov 05, 2024, 06:24 PM UTC
We are investigating a possible problem that is currently causing elevated error rates for Presto queries. We will provide an update as soon as we know more.
-
investigating Nov 05, 2024, 06:49 PM UTC
We are continuing to investigate this issue. We expect most queries to succeed after one or more automatic retries.
-
monitoring Nov 05, 2024, 07:05 PM UTC
We have applied a fix. The problem looks to be resolved, but we are continuing to monitor.
-
resolved Nov 05, 2024, 07:26 PM UTC
Between Nov 5, 17:15 UTC and Nov 5, 18:45 UTC, some customers experienced delays and errors related to Presto. The cause was insufficient capacity, which will be investigated further. A fix has been implemented and the issue has been resolved. We apologize for any inconvenience caused. If you have any questions about it, please contact [email protected]
Read the full incident report →
- Detected by Pingoru
- Oct 16, 2024, 12:12 PM UTC
- Resolved
- Oct 16, 2024, 01:15 PM UTC
- Duration
- 1h 3m
Affected: Presto Query Engine
Timeline · 3 updates
-
investigating Oct 16, 2024, 12:12 PM UTC
We are investigating a possible problem currently affecting Trino/Presto queries for the US region. Queries might have degraded performance. We will provide an update as soon as we know more details.
-
monitoring Oct 16, 2024, 12:41 PM UTC
We have applied a remediation to the infrastructure that showed degraded performance. We are currently monitoring performance closely.
-
resolved Oct 16, 2024, 01:15 PM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Oct 02, 2024, 12:04 AM UTC
- Resolved
- Oct 02, 2024, 01:30 AM UTC
- Duration
- 1h 25m
Affected: Streaming Import REST API, Mobile/Javascript REST API, Data Connector Integrations, Hadoop / Hive Query Engine, Presto Query Engine, Presto JDBC/ODBC Gateway
Timeline · 8 updates
-
investigating Oct 02, 2024, 12:04 AM UTC
We're experiencing an elevated level of API errors and are currently looking into the issue.
-
investigating Oct 02, 2024, 12:06 AM UTC
We are continuing to investigate this issue.
-
investigating Oct 02, 2024, 12:07 AM UTC
We are continuing to investigate this issue.
-
identified Oct 02, 2024, 12:41 AM UTC
The issue has been identified and a fix is being implemented.
-
monitoring Oct 02, 2024, 01:01 AM UTC
A fix has been implemented and we are monitoring the results.
-
monitoring Oct 02, 2024, 01:15 AM UTC
We are continuing to monitor for any further issues.
-
resolved Oct 02, 2024, 01:30 AM UTC
This incident has been resolved.
-
postmortem Oct 02, 2024, 04:34 AM UTC
We experienced a temporary overload on the storage layer. It started at 16:15 PDT and was fixed at 18:15 PDT. The major impact was performance degradation for data ingestion components (Streaming Import REST API, Mobile/Javascript REST API, Data Connector) and the Hive and Presto query engines. Some queries executed on Hive and Presto failed because of the performance degradation of the storage.
Read the full incident report →
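The postmortem above gives its window in PDT, while the timeline entries are stamped in UTC. A minimal sketch of the conversion follows. It assumes that both PDT times fall on October 1, 2024 (local date), which is inferred from the UTC timestamps of the updates and not stated in the postmortem. The helper name `pdt_to_utc` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# PDT is UTC-7; a fixed offset avoids depending on a time zone database.
PDT = timezone(timedelta(hours=-7), "PDT")


def pdt_to_utc(year, month, day, hour, minute):
    """Interpret a wall-clock time as PDT and return an aware UTC datetime."""
    local = datetime(year, month, day, hour, minute, tzinfo=PDT)
    return local.astimezone(timezone.utc)


# Assumption: 16:15 and 18:15 PDT are both on Oct 1, 2024 (local date).
start = pdt_to_utc(2024, 10, 1, 16, 15)
end = pdt_to_utc(2024, 10, 1, 18, 15)

print(start.isoformat())  # 2024-10-01T23:15:00+00:00
print(end.isoformat())    # 2024-10-02T01:15:00+00:00
print(end - start)        # 2:00:00
```

Under that assumption, the overload window runs from 23:15 UTC on October 1 to 01:15 UTC on October 2, which lines up with the first "investigating" update at 12:04 AM UTC and the last "monitoring" update at 01:15 AM UTC.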
- Detected by Pingoru
- Sep 20, 2024, 02:20 AM UTC
- Resolved
- Sep 20, 2024, 06:28 AM UTC
- Duration
- 4h 8m
Affected: Web Interface
Timeline · 3 updates
-
identified Sep 20, 2024, 02:20 AM UTC
We observed a problem with web console access related to showing Standard Audit Logs. We have found the cause of the incident. We are working to resolve the incident.
-
monitoring Sep 20, 2024, 05:44 AM UTC
We have confirmed that the issue is resolved. We will continue to monitor the results.
-
resolved Sep 20, 2024, 06:28 AM UTC
This incident has been resolved.
Read the full incident report →
- Detected by Pingoru
- Sep 04, 2024, 10:22 PM UTC
- Resolved
- Sep 05, 2024, 01:18 PM UTC
- Duration
- 14h 56m
Affected: Presto Query Engine
Timeline · 7 updates
-
investigating Sep 04, 2024, 10:22 PM UTC
We are investigating a possible problem currently affecting Presto. Queries could be delayed. We will provide an update as soon as we know more.
-
monitoring Sep 04, 2024, 11:19 PM UTC
A fix has been implemented and we are monitoring the results.
-
investigating Sep 05, 2024, 01:48 AM UTC
This incident is still ongoing. We are investigating the root cause.
-
investigating Sep 05, 2024, 03:53 AM UTC
Performance has improved for some, though not all, queries. We are continuing to investigate the issue.
-
monitoring Sep 05, 2024, 04:44 AM UTC
We applied the fix. We will continue to monitor the results.
-
monitoring Sep 05, 2024, 08:28 AM UTC
Systems should be back to normal, but we will continue to monitor the situation for a while.
-
resolved Sep 05, 2024, 01:18 PM UTC
The incident is now resolved. All affected components are back to normal. A subset of customers in the US region might have experienced degraded performance on Presto queries between 4:50 PM EDT and 1:40 AM EDT. Presto queries might also have been queued for longer than usual during the incident. Finally, some queries might have failed due to the remediations that were put in place.
Read the full incident report →
- Detected by Pingoru
- Aug 29, 2024, 10:14 AM UTC
- Resolved
- Aug 29, 2024, 10:59 AM UTC
- Duration
- 44m
Affected: Workflow
Timeline · 3 updates
-
investigating Aug 29, 2024, 10:14 AM UTC
Custom Script tasks in Workflow are failing due to an ongoing incident with our infrastructure provider (AWS).
Error example:
Unable to execute HTTP request: Connect to sts.amazonaws.com:443 [sts.amazonaws.com/209.54.177.164] failed: connect timed out
com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to sts.amazonaws.com:443 [sts.amazonaws.com/209.54.177.164] failed: connect timed out
We are actively working on the issue on our end.
-
monitoring Aug 29, 2024, 10:30 AM UTC
The error rate has decreased. Please rerun the failed workflow if needed. We observed errors with Custom Script between 8 AM and 10 AM UTC on August 29th. We are continuing to monitor the issue carefully.
-
resolved Aug 29, 2024, 10:59 AM UTC
The issue with our infrastructure provider (AWS) is resolved, and we are not observing new errors for now. Please rerun the failed workflow if needed.
Read the full incident report →
- Detected by Pingoru
- Aug 29, 2024, 10:07 AM UTC
- Resolved
- Aug 29, 2024, 11:00 AM UTC
- Duration
- 53m
Affected: Data Connector Integrations, Hadoop / Hive Query Engine
Timeline · 2 updates
-
investigating Aug 29, 2024, 10:07 AM UTC
We have found that Data Connector, Result Export, and Hive jobs were unable to start or failed due to an incident with our infrastructure provider (AWS). Some Data Connector, Result Export, and Hive jobs might encounter delays or errors. The issue was observed on Aug 29th between 8:30 UTC and 9:45 UTC. We are still investigating the issue on our end.
-
resolved Aug 29, 2024, 11:00 AM UTC
The issue with our infrastructure provider (AWS) is resolved, and we are not observing new errors for now. Some jobs failed due to the incident, so please rerun the failed jobs if needed.
Read the full incident report →
- Detected by Pingoru
- Aug 06, 2024, 03:32 PM UTC
- Resolved
- Aug 07, 2024, 02:23 PM UTC
- Duration
- 22h 50m
Affected: Streaming Import REST API, CDP Personalization - Ingest API
Timeline · 10 updates
Read the full incident report →
- Detected by Pingoru
- Jul 30, 2024, 11:52 PM UTC
- Resolved
- Jul 31, 2024, 07:53 AM UTC
- Duration
- 8h
Affected: Data Connector Integrations
Timeline · 5 updates
-
identified Jul 30, 2024, 11:52 PM UTC
We are currently experiencing a high error rate in the Custom Script service on Treasure Workflow (US Region) due to an ongoing incident with our infrastructure provider (AWS). The issue appears as increased error rates with an error message like the following:
> Task failed with unexpected error: null (Service: AWSLogs; Status Code: 503; Error Code: null; Request ID: xxxxxx; Proxy: null)
At this time, we do not have an estimated time for full resolution. We will provide further updates as soon as more information becomes available.
-
identified Jul 31, 2024, 02:24 AM UTC
This issue is still ongoing, and we are still seeing Custom Script tasks fail. Custom Script users may also encounter errors related to AWS CloudWatch Logs. According to our infrastructure provider (AWS), they are working on recovery and are seeing some improvements internally, but they expect full recovery to take 1-2 hours. We will provide further updates as soon as more information becomes available.
-
identified Jul 31, 2024, 03:16 AM UTC
Due to the degradation of the Amazon Ads system (https://status.ads.amazon.com), our connectors for the Amazon Ads platform are currently not working properly. If you are using any of the connectors below, your jobs may not be running correctly.
- Amazon Marketing Cloud export
- Amazon Marketing Cloud import
- Amazon Ads export
- Amazon DSP export
We will provide further updates as soon as more information becomes available.
-
monitoring Jul 31, 2024, 05:25 AM UTC
According to our infrastructure provider (AWS), this issue has already been resolved. We also see that the failure rate has been reduced, so we will update this incident to Monitoring status and the affected components to Operational status.
-
resolved Jul 31, 2024, 07:53 AM UTC
This incident has been resolved. All affected components (Custom Script and some Data Connectors) are now back to normal.
Read the full incident report →
- Detected by Pingoru
- Jun 26, 2024, 02:43 AM UTC
- Resolved
- Jun 26, 2024, 03:43 AM UTC
- Duration
- 1h
Affected: Streaming Import REST API, Mobile/Javascript REST API, CDP Personalization - Ingest API, ADL
Timeline · 3 updates
-
investigating Jun 26, 2024, 02:43 AM UTC
We are monitoring delays in the systems responsible for processing incoming events ingested through our ingestion API. There are also increased errors in the ingestion API. The delay is caused by infrastructure issues at our provider, which are currently being addressed. We are monitoring the situation. During this time, writing to storage may be delayed, but there is no evidence of data loss.
-
monitoring Jun 26, 2024, 02:43 AM UTC
We are in constant communication with our service provider.
-
resolved Jun 26, 2024, 03:43 AM UTC
The issue is resolved at the provider and all components have completed catch-up.
Read the full incident report →
- Detected by Pingoru
- Jun 25, 2024, 01:28 PM UTC
- Resolved
- Jun 25, 2024, 01:28 PM UTC
- Duration
- —
Affected: CDP Personalization - Lookup API, CDP Personalization - Ingest API
Timeline · 1 update
-
resolved Jun 25, 2024, 01:28 PM UTC
Between 2:02 a.m. and 5:47 a.m. PDT, the CDP Personalization API experienced elevated API error rates. The engineering team identified the computing instance causing the issue and implemented a fix. The problem has been resolved. Personalization API clients that implement error retries observed no issues. We apologize for any inconvenience caused. If you have any questions about it, please contact [email protected]
Read the full incident report →
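The June 25, 2024 update above notes that clients with error retries were unaffected by the elevated error rate. One common way to implement such retries is exponential backoff with jitter, sketched below. This is an illustration under stated assumptions, not Treasure Data's client code: `TransientError`, `call_with_retry`, and all parameter defaults are hypothetical, and a real client would map its own retryable failures (for example HTTP 5xx responses or timeouts) onto the exception.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a retryable failure (e.g. HTTP 5xx or a timeout)."""


def call_with_retry(request, max_attempts=4, base_delay=0.5, max_delay=8.0,
                    sleep=time.sleep):
    """Call request(); on TransientError, retry with exponential backoff and jitter.

    request is any zero-argument callable. The last failure is re-raised
    once max_attempts calls have been made.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TransientError:
            if attempt == max_attempts:
                raise
            # The backoff cap doubles on each attempt, bounded by max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random time in [0, cap] so that many
            # clients do not retry in lockstep.
            sleep(random.uniform(0, cap))
```

The `sleep` parameter is injectable so the function can be exercised without real waiting. Retrying is only safe for requests that are idempotent or that the server deduplicates.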
- Detected by Pingoru
- Jun 20, 2024, 04:33 AM UTC
- Resolved
- Jun 20, 2024, 04:52 AM UTC
- Duration
- 18m
Affected: Hadoop / Hive Query Engine
Timeline · 2 updates
-
monitoring Jun 20, 2024, 04:33 AM UTC
All Hive jobs, excluding CDP Workflow, ran on Hive4 (query engine 2023.1) during the following time periods.
- [US Region] 2024-06-19 07:45 +0000 - 2024-06-20 04:06 +0000
- [Tokyo Region] 2024-06-19 09:05 +0000 - 2024-06-20 04:08 +0000
- [EU Region] 2024-06-19 09:14 - 2024-06-20 04:09 +0000
- [Korea Region] 2024-06-19 09:21 +0000 - 2024-06-20 04:10 +0000
- [AP03 Region] 2024-06-19 09:30 +0000 - 2024-06-20 04:11 +0000
We have fixed the issue, and all Hive jobs are now properly executed on the query engine specified by the user. We apologize for the inconvenience.
-
resolved Jun 20, 2024, 04:52 AM UTC
Verified that the issue is completely resolved. We apologize for the inconvenience.
Read the full incident report →