WEDO experienced a major incident on September 21, 2021, lasting 4h 16m. The incident has been resolved; the full update timeline is below.
Update timeline
- identified Sep 21, 2021, 11:40 AM UTC
Some users can't log in to WEDO.
- monitoring Sep 21, 2021, 11:42 AM UTC
We've identified the issue. It's related to our external service that processes our billing. We've mitigated the issue.
- monitoring Sep 21, 2021, 12:02 PM UTC
The fix has been pushed and we are monitoring the situation to confirm that it works. For now, we've blocked the creation of new users until the issue is fully resolved. We are sorry for the inconvenience.
- resolved Sep 21, 2021, 03:20 PM UTC
We've deployed a final fix. It's now possible to create users again.
- postmortem Sep 24, 2021, 11:55 AM UTC
**Summary:** Users with a subscription could not log in to the system. The login page failed because an external API was inaccessible. During the login process, billing and subscription information is fetched from the billing system. The billing system was unavailable, which caused the login to fail.

**Customer Impact:** Users could not log in to the platform.

**Timeline:**

* 13h03: users start reporting to the support team that they can't log in
* 13h07: after several reports, the support team reports the incident to the dev team
* 13h15: after an initial attempt to understand the incident, the problem is reported to the head of the dev team
* 13h35: the source of the login failures is identified
* 13h55: a patch is applied to the system to mitigate the issue; login is restored
* 15h10: a fix is deployed to correctly handle the issue

**Contributing factors:** An external system (the billing system) was not responding.

**Lessons Learned:** We should always write our code with the expectation that third-party services may fail. A fallback should always be implemented.

**Action items:**

* Review the code to check all requests to external APIs
* Reduce the number of calls to external APIs
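
The fallback lesson above could look roughly like the following. This is a minimal sketch, not WEDO's actual code: `Subscription`, `BillingUnavailable`, `fetch_subscription_with_fallback`, `login`, and the in-memory `cache` are all hypothetical names introduced for illustration. It assumes the billing client raises an exception on timeout or error, and that serving the last known subscription is acceptable while billing is down.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Subscription:
    plan: str
    stale: bool = False  # True when served from the cache instead of the billing API


class BillingUnavailable(Exception):
    """Hypothetical error raised by the billing client on timeout or failure."""


def fetch_subscription_with_fallback(
    user_id: str,
    fetch: Callable[[str], Subscription],
    cache: Dict[str, Subscription],
) -> Optional[Subscription]:
    """Try the billing API; on failure, fall back to the last known value."""
    try:
        sub = fetch(user_id)
    except BillingUnavailable:
        cached = cache.get(user_id)
        if cached is None:
            return None  # nothing known about this user; caller decides what to do
        return Subscription(plan=cached.plan, stale=True)
    cache[user_id] = sub  # remember the last successful answer
    return sub


def login(
    user_id: str,
    fetch: Callable[[str], Subscription],
    cache: Dict[str, Subscription],
) -> dict:
    """Login succeeds even if billing is down; the session is flagged as degraded."""
    sub = fetch_subscription_with_fallback(user_id, fetch, cache)
    return {
        "user_id": user_id,
        "logged_in": True,
        "plan": sub.plan if sub else None,
        "billing_degraded": sub is None or sub.stale,
    }
```

The design choice here is that a billing outage degrades the session (stale or unknown plan) rather than blocking login. Whether that trade-off is acceptable depends on what the subscription data gates, which this report does not specify.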