DRACOON incident

Partial Outage of Storage (Up- and Download)

DRACOON experienced a major incident on February 29, 2024 affecting Upload and Download, lasting 5h 3m. The incident has been resolved; the full update timeline is below.

Started: Feb 29, 2024, 09:21 AM UTC
Resolved: Feb 29, 2024, 02:25 PM UTC
Duration: 5h 3m
Detected by Pingoru: Feb 29, 2024, 09:21 AM UTC

Affected components

Upload, Download

Update timeline

investigating Feb 29, 2024, 09:21 AM UTC

We are currently investigating an issue with our Storage. Our team is working to gather more information and resolve the issue as quickly as possible. We apologize for any inconvenience this may cause and will provide updates as soon as we have them.

identified Feb 29, 2024, 09:51 AM UTC

We have identified the issue with our Storage. Our engineers are working on a resolution and we are confident that we will have a fix in place soon. We will provide more information and updates as soon as they become available.

monitoring Feb 29, 2024, 01:29 PM UTC

The issue with Storage has been resolved, and we are monitoring the situation to ensure it remains stable. We apologize for any inconvenience this may have caused and appreciate your patience.

resolved Feb 29, 2024, 02:25 PM UTC

The issue with Storage has been fully resolved. All systems are now operating normally. We apologize for any inconvenience this may have caused and appreciate your patience. If you continue to experience any issues, please don't hesitate to reach out to our support team for assistance.

postmortem Mar 13, 2024, 06:42 AM UTC

We experienced an issue with **up- and downloads for hybrid customers using their own S3 storage** on **February 29, 2024, after updating one of our backend components the night before**. Our team has worked diligently to identify the root cause and implement a resolution. In this post-mortem, we want to share the details of what happened, why it happened, what we did to resolve it, and what we will do to prevent similar incidents in the future.

What happened?

**After the latest update during our maintenance window, up- and downloads were not possible for customers using their own S3 storage.**

Why did this happen?

**One of our automated quality checks introduced an overly restrictive policy without context. This led to a software change that broke up- and downloads for hybrid customers.**

What did we do?

**Our incident response team quickly identified the underlying issue and provided an updated version of the affected component.**

What can we do to improve?

**We will extend our automated test cases covering hybrid setups to identify such issues before a production rollout.**

We apologize for any inconvenience this incident may have caused. We are committed to ensuring the stability and reliability of our services and will continue to take proactive measures to prevent similar incidents from happening in the future. If you have any questions or concerns, please don't hesitate to reach out to our support team for assistance.