eMerchantPay incident

Genesis Partial Outage

Severity: Minor. Status: Resolved.

eMerchantPay experienced a minor incident on March 26, 2018, lasting 2d 3h. The incident has been resolved; the full update timeline is below.

Started
Mar 26, 2018, 11:40 AM UTC
Resolved
Mar 28, 2018, 03:14 PM UTC
Duration
2d 3h
Detected by Pingoru
Mar 26, 2018, 11:40 AM UTC

Update timeline

  1. investigating Mar 26, 2018, 11:40 AM UTC

    Dear Merchant, please note that we are experiencing a partial outage of our gateway, which we are currently investigating. We will inform you once the issue is resolved. If you need further assistance or information, do not hesitate to contact us at [email protected]. Kind regards, eMerchantPay Tech Support

  2. identified Mar 26, 2018, 12:30 PM UTC

    Dear Merchant, we have a severe network outage between data centers (DCs): BER is completely cut off from both the AMS and FRA DCs. We have bootstrapped the DB cluster in BER PROD and are currently processing with limited capabilities from one DC only, BER. We are confident we can support the current volume with a single DC. Updates on network restoration and the return to full processing capabilities will follow as soon as the issue is resolved. If you need further assistance or information, do not hesitate to contact us at [email protected]. Best regards, eMerchantPay Tech Support Team

  3. monitoring Mar 28, 2018, 12:26 AM UTC

    Dear Merchants, we are still processing with one DC (BER DC). As noted previously, all services should be stable and you should not encounter any processing issues despite the partial outage. We are actively working on restoring services in the AMS DC. We expect to restore them tomorrow and then resume processing with both DCs in our standard active-active setup. We will notify you tomorrow once this is finalized.

    Additionally, we have spent the last two days restoring data from the AMS DC DB cluster, which diverged from the BER DC cluster during the network outage (a rare split-brain scenario). We implemented complex code and scripts to extract only the AMS-specific diverged data, process it, and import it into the main DB cluster, so that no transactions, web payment form payments, etc. are lost. This was completed successfully an hour ago. As a result, the previously missing data has been imported and is fully available in all gateway services, including the reconciliation services and the UI.

    We advise you not to void or refund transactions from the affected timeframe: we have made sure these transactions are batched in settlements with the schemes, so there should not be any financial impact.

    Regarding the incident itself: a fire in BER cut off all of our connectivity to both the AMS DC and the FRA DC, on both the primary and backup internet lines (it appears that many telco transits were affected). Our DC personnel have restored connectivity from the BER DC to the AMS DC and FRA DC via a reserved line, and are working with the relevant authorities to restore the regular transits tomorrow.

    We apologize for the issues that occurred on Monday, March 26. We will keep you informed as noted above. Best regards, eMerchantPay Tech Support Team

  4. resolved Mar 28, 2018, 03:14 PM UTC

    Dear Merchants, we have successfully restored full capacity for our gateway platform API and applications. We are now processing at full throttle: both our BER DC and AMS DC are fully functional, and all internal and external services are up and running. Thank you for your patience. If you have any questions or concerns, please don't hesitate to contact us at [email protected]. Best regards, [email protected]
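
Update 3 describes extracting only the AMS-specific diverged data and importing it into the main DB cluster after the split brain. The vendor did not publish its scripts, so the following is only a minimal sketch of that general idea, under stated assumptions: each record carries a globally unique ID and a creation timestamp, and the moment of the split is known. The `Txn` type and the `extract_diverged` and `merge` functions are hypothetical names invented for illustration, not the vendor's code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction record; the fields are assumptions."""
    txn_id: str            # assumed to be globally unique across DCs
    created_at: datetime   # assumed to be stored in UTC
    amount_cents: int


def extract_diverged(main, other, split_start):
    """Find records written to `other` after the split that `main` lacks.

    `main` and `other` map txn_id -> Txn. Returns (to_import, conflicts),
    where `conflicts` lists IDs present on both sides with different content.
    """
    to_import, conflicts = [], []
    for txn_id, txn in other.items():
        if txn.created_at < split_start:
            continue  # written before the split, assumed already replicated
        if txn_id not in main:
            to_import.append(txn)
        elif main[txn_id] != txn:
            conflicts.append(txn_id)  # same ID, different data: review manually
    return to_import, conflicts


def merge(main, to_import):
    """Import the extracted records into the main data set, in place."""
    for txn in to_import:
        main[txn.txn_id] = txn


if __name__ == "__main__":
    # Split time taken from the incident start on this page (an assumption).
    split = datetime(2018, 3, 26, 11, 40, tzinfo=timezone.utc)
    shared = Txn("t1", datetime(2018, 3, 26, 10, 0, tzinfo=timezone.utc), 100)
    ber = Txn("t2", datetime(2018, 3, 26, 12, 0, tzinfo=timezone.utc), 200)
    ams = Txn("t3", datetime(2018, 3, 26, 12, 5, tzinfo=timezone.utc), 300)

    ber_cluster = {"t1": shared, "t2": ber}
    ams_cluster = {"t1": shared, "t3": ams}

    rows, conflicts = extract_diverged(ber_cluster, ams_cluster, split)
    merge(ber_cluster, rows)
    print(sorted(ber_cluster), conflicts)
```

In this sketch, conflicting IDs are reported rather than overwritten, since a real recovery would need to decide per record which side is authoritative.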