DRACOON incident

Partial Outage of Storage Upload

Major Resolved

DRACOON experienced a major incident on August 22, 2024 affecting Upload, lasting 3h 19m. The incident has been resolved; the full update timeline is below.

Started
Aug 22, 2024, 10:58 AM UTC
Resolved
Aug 22, 2024, 02:18 PM UTC
Duration
3h 19m
Detected by Pingoru
Aug 22, 2024, 10:58 AM UTC

Affected components

Upload

Update timeline

  1. investigating Aug 22, 2024, 10:58 AM UTC

    We are currently investigating an issue with our storage. Our team is working to gather more information and resolve the issue as quickly as possible. We apologize for any inconvenience this may cause and will provide updates as soon as we have them.

  2. monitoring Aug 22, 2024, 11:18 AM UTC

    The issue with storage has been resolved, and we are monitoring the situation to ensure it remains stable. We apologize for any inconvenience this may have caused and appreciate your patience.

  3. resolved Aug 22, 2024, 02:18 PM UTC

    The issue with storage has been fully resolved. All systems are now operating normally. We apologize for any inconvenience this may have caused and appreciate your patience. If you continue to experience any issues, please don't hesitate to reach out to our support team for assistance.

  4. postmortem Sep 19, 2024, 09:22 AM UTC

    We experienced an issue with the DRACOON Cloud on 22.08.2024 from 12:30 to 15:00. Our team has worked diligently to identify the root cause and implement a resolution. In this post-mortem, we want to share the details of what happened, why it happened, what we did to resolve it, and what we will do to prevent similar incidents in the future.

    What happened?
    There was a partial outage of our upload functionality during the incident window. Only some of our customers were affected.

    Why did this happen?
    One of our backend services experienced issues, which caused uploads to fail to complete for some of our customers.

    What did we do?
    Our incident response team quickly identified the underlying issue and resolved it by restarting the affected service and increasing its number of instances.

    What can we do to improve?
    We will further investigate the issue and improve the affected service and its monitoring to prevent similar issues in the future.

    We apologize for any inconvenience this incident may have caused. We are committed to ensuring the stability and reliability of our services and will continue to take proactive measures to prevent similar incidents in the future. If you have any questions or concerns, please don't hesitate to reach out to our support team for assistance.