TeamDynamix incident

Conversational AI: degraded performance

Minor Resolved

TeamDynamix experienced a minor incident on April 9, 2025 affecting Flow Execution, lasting 7h 6m. The incident has been resolved; the full update timeline is below.

Started
Apr 09, 2025, 01:01 PM UTC
Resolved
Apr 09, 2025, 08:08 PM UTC
Duration
7h 6m
Detected by Pingoru
Apr 09, 2025, 01:01 PM UTC

Affected components

Flow Execution

Update timeline

  1. investigating Apr 09, 2025, 01:01 PM UTC

    Azure OpenAI Service and Azure AI Services have shown elevated latency in East US since 11:20pm EST on 4/8. Microsoft indicates a fix is in progress but has given no estimated time to resolution. We're only seeing this impact *some* Conversational AI customers; if you are impacted, your bots MAY see degradation in AI Knowledge Sets, bot translations, and processes utilizing the “Generate” step. Knowledge set jobs are also impacted.

  2. identified Apr 09, 2025, 03:51 PM UTC

    Microsoft is still working on a fix for the underlying issue. We have also noticed that bot training and intent matching may be impacted.

  3. identified Apr 09, 2025, 04:07 PM UTC

    The following communication was received from Microsoft this morning regarding an ongoing Conversational AI bot behavior issue that TDX has recognized:

    ~~~~~

    Impact Statement: Starting at 02:30 UTC on 09 April 2025, you/TeamDynamix were identified as a customer using Azure OpenAI Service and Azure AI Services in East US, who may experience elevated latencies and availability issues across all AOAI models.

    Current Status: We determined that a recent deployment impacted backend instances upon which Azure OpenAI Service and Azure AI Services rely, which is leading to latency and availability problems. Our mitigation strategy consisted of performing a rollback to a previous, known build; however, it has not proven effective at alleviating impact. We are now focusing our efforts on identifying an appropriate mitigation path to resolve this issue. The next update will be provided in 2 hours or as events warrants.

    ~~~~~

    Please note the Azure status page shows no issues unless signed in as an Azure Service User. TeamDynamix will continue to monitor their progress and will advise as soon as they indicate success in implementing a fix or other mitigating measures.

    At this time, the behaviors noted previously are being caused by this Azure OpenAI and AI Services issue (Azure tracking ID 1T3L-3QG). To this point, we've noted issues with AI Knowledge Sets, bot translations, processes utilizing the “Generate” step, Knowledge set jobs, bot training, and intent matching. From our best assessment, this is only affecting some Conversational AI customers/environments.

  4. monitoring Apr 09, 2025, 05:29 PM UTC

    Azure has notified us that this issue has been resolved through mitigating steps. Once we receive specific resolution details from them, we will post them here on this incident.

  5. resolved Apr 09, 2025, 08:08 PM UTC

    Closing as the issue has been confirmed as resolved. Further details will be added when/if we receive them from Azure on the issue.