Viedoc incident
Issues generating documents in Viedoc Designer on European servers
Viedoc experienced a notice-level incident on August 21, 2024, affecting background processes (import/export/revisions/alerts/archive/disposal) and lasting 2 hours. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Aug 21, 2024, 10:04 AM UTC
There are currently issues with generating documents (blank CRFs, annotated CRFs and study workflow) in Viedoc Designer on the European training and production servers. This is being investigated.
- resolved Aug 21, 2024, 12:05 PM UTC
This incident has been resolved. Documents can now be generated in Viedoc Designer as expected again. The issue was resolved in the European production instance around 10:30 UTC and for the European training instance around 11:55 UTC.
- postmortem Aug 23, 2024, 12:32 PM UTC
**Description** Between the late afternoon of 2024-08-20 and noon on 2024-08-21, some documents (blank CRF, annotated CRF, workflow) could not be generated in Viedoc Designer. Users who attempted to generate these documents saw the document generation stuck at 0%. Once the worker was re-started and the corrupt print jobs were removed, the worker functioned as expected again.
**Instances affected** [https://v4.viedoc.net/](https://v4.viedoc.net/); [https://v4training.viedoc.net](https://v4training.viedoc.net)
**Duration** Between the late afternoon of 2024-08-20 and noon on 2024-08-21.
**Cause** In preparation for technical infrastructure upgrades, the Designer worker that handles print jobs was re-started in the afternoon of 2024-08-20. A print job that was in progress at the time of the re-start ended up in a corrupt state that the worker could not process. This blocked the queue, and no other print jobs were executed in Designer. Around 9 AM on 2024-08-21, the first customer report came in that print jobs were not being processed.
**Impact** Between the re-start of the worker and the reset of the corrupt print job's status, no print jobs handled by this worker were processed. Once the status of the corrupt print job was reset, the queue was processed as expected again. The impact for customers was therefore a delay in getting these specific print jobs completed.
**Corrective action** When the print job queue was reviewed, the invalid print job blocking the queue was detected. The status of this job was then reset so that the worker would start it over. After this action, the print job queue was processed as normal again, and print jobs that had not previously been completed were executed.
**Preventive action** Since this issue was triggered by the manual intervention of re-starting the worker, the initial preventive action is to manually monitor that print jobs start executing as expected whenever the worker needs to be re-started. For the future, improved automatic monitoring and re-try attempts will be evaluated.
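As an illustration only, the kind of automatic monitoring and re-try logic mentioned above could look roughly like the following sketch. It is not Viedoc's implementation. The `PrintJob` record, its status values, the 15-minute stall threshold, and the limit of three attempts are all assumptions made for the example. The idea is that a job left in a "processing" state for too long (for example, after a worker re-start) is put back in the queue, and is marked as failed after repeated attempts so that it cannot block the jobs behind it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class PrintJob:
    # Hypothetical job record; field names and status values are assumptions.
    job_id: str
    status: str  # "queued", "processing", "done", or "failed"
    started_at: Optional[datetime] = None
    attempts: int = 0


def reset_stalled_jobs(
    jobs: List[PrintJob],
    now: datetime,
    stall_after: timedelta = timedelta(minutes=15),  # assumed threshold
    max_attempts: int = 3,  # assumed re-try limit
) -> List[str]:
    """Requeue jobs stuck in 'processing'; fail them after max_attempts.

    Returns the ids of the jobs whose status was changed.
    """
    touched = []
    for job in jobs:
        if job.status != "processing":
            continue
        # A processing job with no start time, or one that started too long
        # ago, is treated as stalled.
        stalled = job.started_at is None or now - job.started_at >= stall_after
        if not stalled:
            continue
        job.attempts += 1
        job.started_at = None
        # Give up after too many attempts so the job cannot block the queue.
        job.status = "queued" if job.attempts < max_attempts else "failed"
        touched.append(job.job_id)
    return touched


if __name__ == "__main__":
    now = datetime(2024, 8, 21, 9, 0, tzinfo=timezone.utc)
    queue = [
        PrintJob("job-1", "processing", now - timedelta(hours=16)),
        PrintJob("job-2", "queued"),
    ]
    print(reset_stalled_jobs(queue, now))  # prints ['job-1']
```

A check like this could run on a schedule or right after a worker re-start. Whether a stalled job should be re-tried or escalated to an operator depends on how the real queue represents job state, which is not described in this report.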