MadKudu incident

Copilot Partial Outage

Major · Resolved

MadKudu experienced a major incident on May 27, 2025 affecting msi.madkudu.com and app.madkudu.com, lasting 2d 17h. The incident has been resolved; the full update timeline is below.

Started: May 27, 2025, 02:21 PM UTC
Resolved: May 30, 2025, 07:46 AM UTC
Duration: 2d 17h
Detected by Pingoru: May 27, 2025, 02:21 PM UTC

Affected components

msi.madkudu.com, app.madkudu.com

Update timeline

  1. investigating May 27, 2025, 02:21 PM UTC

    One of our providers, LaunchDarkly, is experiencing a major outage (https://status.launchdarkly.com/). We use LaunchDarkly for our feature flag infrastructure, i.e. to control how features are released. We are looking for a way to mitigate the impact without waiting for LaunchDarkly to restore their service.

  2. monitoring May 27, 2025, 03:52 PM UTC

    The LaunchDarkly outage caused one of our systems to overload our production database, which in turn prevented the Copilot service from handling some requests. After we restarted the database, and as LaunchDarkly gradually recovered, the load decreased and Copilot is now functioning normally. We are closely monitoring the situation and are developing a plan to improve our resilience to similar incidents, which will be implemented in the coming weeks.

  3. resolved May 30, 2025, 07:46 AM UTC

    This incident has been resolved.
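
The timeline above does not say how the feature flag outage turned into database load, nor what the resilience plan contains. As a purely illustrative sketch of one common mitigation, the wrapper below serves the last known flag value (or a caller-supplied default) when the flag provider is unreachable, and waits a full TTL before retrying so a provider outage does not produce a retry storm. The names `CachedFlagReader`, `fetch`, and `ttl_seconds` are hypothetical; this is not MadKudu's code and it does not use the LaunchDarkly SDK.

```python
import time
from typing import Callable, Dict, Tuple


class CachedFlagReader:
    """Hypothetical wrapper that keeps flag reads working during a provider outage."""

    def __init__(
        self,
        fetch: Callable[[str], bool],  # assumed: calls the flag provider, may raise
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # flag key -> (value, time the entry was last written)
        self._cache: Dict[str, Tuple[bool, float]] = {}

    def is_enabled(self, key: str, default: bool = False) -> bool:
        now = self._clock()
        cached = self._cache.get(key)

        # Fresh entry: answer locally without contacting the provider.
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]

        try:
            value = self._fetch(key)
        except Exception:
            # Provider unavailable: fall back to the stale value if there is
            # one, otherwise to the caller's default.
            value = cached[0] if cached is not None else default

        # Store the result (including a fallback) so the next provider call
        # for this key happens only after another TTL has elapsed.
        self._cache[key] = (value, now)
        return value
```

With this shape, callers keep receiving a deterministic answer while the provider is down, at the cost of flag changes taking up to one TTL to appear. Note that a fallback default is cached like any other value, so a later call with a different default still sees the cached one until the TTL expires.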