Keepit experienced a major incident on May 6, 2022 affecting User Interface and API, lasting 6h 39m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating May 06, 2022, 07:33 AM UTC
We are seeing significant problems reaching the Copenhagen (dk-co) region and using the user interface. This affects customers' ability to access datasets. The root cause is currently unknown; we are investigating and will update this incident as we learn more.
- identified May 06, 2022, 09:30 AM UTC
We are temporarily cancelling a number of running backup jobs so that we can re-prioritise resources. If you receive e-mail notifications about cancelled backup jobs, this is the cause. Cancelled backups will be restarted automatically shortly, so this is a low-risk operation. The focus right now is on re-establishing normal access to the platform.
- identified May 06, 2022, 11:10 AM UTC
We have now been reliably back online for more than an hour after decreasing resource consumption across the network. Work is now underway to let us ramp back up to normal production inflow; at the moment, new backup jobs are still being started at a lower rate than usual. Further outages are not expected, but we are currently operating at a higher risk of unexpected interruptions. We will continue to update this incident as the situation evolves.
- resolved May 06, 2022, 02:12 PM UTC
The reachability problems have been fully resolved for some time now. We will of course continue to monitor the situation as we work on the remaining issue of ramping backup ingest back up to its normal rate.