SAP Conversational AI incident

Partial Outage: design time restrictions with forking and version creation

SAP Conversational AI experienced a major incident on September 25, 2020 affecting [US10] Design Time, lasting 3h 16m. The incident has been resolved; the full update timeline is below.

Started
Sep 25, 2020, 07:07 PM UTC
Resolved
Sep 25, 2020, 10:23 PM UTC
Duration
3h 16m
Detected by Pingoru
Sep 25, 2020, 07:07 PM UTC

Affected components

[US10] Design Time

Update timeline

  1. investigating Sep 25, 2020, 07:07 PM UTC

    We are currently investigating an issue related to bot creation, bot forking, and version creation. These operations currently fail for intent- and entity-related data, leaving the bot or bot version in a non-working state. We are analyzing the issue and will update this incident with further information as it becomes available. Please note that the runtime is *not* impacted by this incident and productive bots continue to work as expected.

  2. investigating Sep 25, 2020, 08:04 PM UTC

    We are continuing the investigation. We suspect the issue lies in the worker jobs for dataset-related (intents, entities) tasks, specifically in their connection to the database. We will update this incident as more information becomes available. In the meantime, we recommend not creating new bots or versions and not forking bots, as these operations would fail. We apologize for the inconvenience.

  3. monitoring Sep 25, 2020, 09:57 PM UTC

    We have identified the cause: the workers responsible for forking and related operations were unable to acquire their database connections. We have fixed the issue, and all design time operations should be fully functional again. We are monitoring the system for stability and will conduct a detailed root cause analysis to prevent similar issues in the future.

  4. resolved Sep 25, 2020, 10:23 PM UTC

    This incident has been resolved and the platform is fully operational.