Meddbase incident

System availability

Critical, Resolved

Meddbase experienced a critical incident on September 24, 2024, affecting Application Access and lasting 3h 22m. The incident has been resolved; the full update timeline is below.

Started
Sep 24, 2024, 04:13 PM UTC
Resolved
Sep 24, 2024, 07:35 PM UTC
Duration
3h 22m
Detected by Pingoru
Sep 24, 2024, 04:13 PM UTC

Affected components

Application Access

Update timeline

  1. investigating Sep 24, 2024, 04:13 PM UTC

    We are aware of issues impacting users logging into the Meddbase UK environment. We are currently investigating this as a matter of urgency and will update this page once more information is available.

  2. investigating Sep 24, 2024, 04:55 PM UTC

    We continue to investigate and attempt to resolve the issue impacting the availability of our UK system. Data that was being saved in the system against a Clinical Form should be saved once the system resumes. Other actions will have surfaced an error to users and will have to be repeated. We may need to restart the system to resolve the login issues. Please note that the system will be unavailable during restarts. This page will be updated regularly, including once the system becomes operational again.

  3. investigating Sep 24, 2024, 06:19 PM UTC

    Our engineers are investigating whether the cause of the outage is server-related, in an attempt to isolate the issue further so that we can take steps to address it. We will keep this page updated and will endeavour to give notice if we expect this issue to continue in the medium term. Our engineers all continue to work beyond typical working hours to ensure the system becomes operational as soon as possible.

  4. identified Sep 24, 2024, 06:51 PM UTC

    We have narrowed the issue down to our Microsoft SQL Server availability groups, which is causing various client systems and our microservices to become unavailable. Our engineers are actively attempting to diagnose and resolve the problem. There is currently no risk of data loss. The issue appears to be related to the failover cluster that ensures that Meddbase data is replicated between two locations at all times. The databases have become inaccessible because transactions cannot be replicated to the replicas. We are currently unable to see why the replicas are not syncing, as the network is fully functional. We continue to investigate the issue and are considering all possible options to bring the systems back online safely.

  5. identified Sep 24, 2024, 07:28 PM UTC

    We have managed to bring our databases back online and will begin restarting customer systems. We hope to restore access to the system shortly.

  6. resolved Sep 24, 2024, 07:35 PM UTC

    We have now restored system access for all UK customers. Our teams will continue to monitor the situation, but this incident is now resolved and system use can resume. A postmortem and investigation of this incident will follow.