Dixa incident

Loading conversations not working

Major · Resolved

Dixa experienced a major incident on July 1, 2025, affecting the Agent Interface and Dashboard and lasting 14 minutes. The incident has been resolved; the full update timeline is below.

Started
Jul 01, 2025, 09:01 AM UTC
Resolved
Jul 01, 2025, 09:15 AM UTC
Duration
14m
Detected by Pingoru
Jul 01, 2025, 09:01 AM UTC

Affected components

Agent Interface, Dashboard

Update timeline

  1. investigating Jul 01, 2025, 09:01 AM UTC

    We are currently investigating an issue with loading conversations.

  2. monitoring Jul 01, 2025, 09:04 AM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Jul 01, 2025, 09:15 AM UTC

    This incident has been resolved.

  4. postmortem Jul 04, 2025, 08:34 AM UTC

    **Incident Date:** July 1, 2025
    **Incident Duration:** 10:49–11:15 AM UTC
    **Severity:** Major
    **Status:** Resolved

    ## Summary

    On the morning of July 1st, an infrastructure change aimed at improving the scalability of one of our core backend components caused a temporary disruption to key services, including the loading of conversations and analytics for several customers.

    ## Impact

    Between **10:49 AM and 11:01 AM UTC**, users experienced degraded service, including:

    * Inability to load Conversations and Analytics
    * Increased API latency and error rates
    * Service instability across core business functionalities

    The issue was identified quickly, and a rollback restored full functionality within 12 minutes.

    ## Timeline of Events

    | Time (UTC) | Event |
    | --- | --- |
    | **10:49 AM** | A change was deployed to our core backend component. |
    | **10:53 AM** | Analytics service loading issues were reported. |
    | **10:54 AM** | Multiple customers reported that Conversations were not loading. |
    | **10:56 AM** | Engineers decided to roll back the change to the previous configuration. |
    | **11:01 AM** | Rollback completed. Services began functioning normally. |
    | **11:15 AM** | Incident was marked as resolved. API performance confirmed stable. |

    ## Immediate Fix

    The engineering team initiated a **rollback** to the previous provisioned capacity mode, restoring the prior configuration for both the table and its index. This resolved the throttling within minutes. An illustrative sketch of this kind of rollback appears after the update timeline.

    ## Long-Term Preventive Actions

    To prevent similar incidents in the future, we are improving how configuration changes are tested and deployed:

    * **Improved Change Validation**: We’re enhancing our testing processes to better simulate real-world usage before changes are rolled out to production.
    * **Stronger Safeguards on Infrastructure Changes**: We’re updating how critical infrastructure components are configured to ensure changes do not unintentionally impact system performance.

    These improvements are part of our ongoing commitment to delivering a stable and reliable experience.

    ## Closing Notes

    We sincerely apologize for the disruption this caused. The lessons from this incident are already being acted on to further strengthen our systems and processes. Ensuring the reliability and scalability of our platform remains a top priority. If you have any questions or would like more details, please don’t hesitate to reach out to our support team.
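
Illustrative rollback sketch

The postmortem's Immediate Fix describes rolling a table and its index back to provisioned capacity mode to stop throttling, but it does not name the datastore or the resources involved. As a hedged illustration only, assuming an AWS DynamoDB-style table whose capacity-mode change caused the throttling, such a rollback might look roughly like the sketch below; the table name, index name, and capacity values are hypothetical.

```python
# Illustrative sketch only: the postmortem does not name the datastore.
# This assumes an AWS DynamoDB table whose capacity-mode change caused
# throttling; table/index names and capacity units are hypothetical.
import boto3

dynamodb = boto3.client("dynamodb")

def rollback_to_provisioned(table_name: str, index_name: str,
                            read_units: int, write_units: int) -> None:
    """Revert a table and one global secondary index to provisioned capacity."""
    dynamodb.update_table(
        TableName=table_name,
        BillingMode="PROVISIONED",
        ProvisionedThroughput={
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
        GlobalSecondaryIndexUpdates=[{
            "Update": {
                "IndexName": index_name,
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": read_units,
                    "WriteCapacityUnits": write_units,
                },
            }
        }],
    )
    # Block until the table reports ACTIVE again before calling the rollback done.
    dynamodb.get_waiter("table_exists").wait(TableName=table_name)

# Hypothetical usage:
# rollback_to_provisioned("conversations", "conversations_by_customer", 500, 500)
```

Reverting the table and its index in a single update keeps the two in step, which matches the postmortem's note that the prior configuration was restored for both before the throttling cleared.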