FlagSmith incident

Slow response times for Edge API requests

Minor Resolved

FlagSmith experienced a minor incident on July 10, 2023 affecting Edge API, lasting 29m. The incident has been resolved; the full update timeline is below.

Started
Jul 10, 2023, 12:29 PM UTC
Resolved
Jul 10, 2023, 12:58 PM UTC
Duration
29m
Detected by Pingoru
Jul 10, 2023, 12:29 PM UTC

Affected components

Edge API

Update timeline

  1. investigating Jul 10, 2023, 12:29 PM UTC

    We are currently investigating this issue.

  2. identified Jul 10, 2023, 12:44 PM UTC

    We have identified an issue with a downstream service that is having a knock-on effect on our performance. We are currently deploying a change to remove the downstream service.

  3. monitoring Jul 10, 2023, 12:50 PM UTC

    The downstream service has been successfully removed. Response times have returned to normal. We are continuing to monitor the situation.

  4. resolved Jul 10, 2023, 12:58 PM UTC

    This incident has been resolved.

  5. postmortem Jul 10, 2023, 01:23 PM UTC

    ## Timeline

    At 12:15pm UTC, we were notified of increased response times on a number of our Edge API endpoints. Investigation showed nothing immediately obvious but we suspected that it could be caused by Sentry, our APM tool. We set about removing the Sentry initialisation from our code and deployed it as soon as we could. At 12:48pm UTC, this change was deployed and we observed the response times decrease immediately. At 12:52pm UTC our monitoring confirmed that the average response time had returned to normal.

    ## Next Steps

    * Look into improvements to reduce / remove the impact of Sentry issues on our Edge API.
    * Decrease the shutdown timeout of the Sentry SDK.
    * Look at using [Sentry relay](https://docs.sentry.io/product/relay) to remove the impact on core Edge API services.
    * Add integration tests to simulate performance degradation / outages from all downstream services.
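
The last next step, simulating a degraded downstream service in a test, could look roughly like the sketch below. It is an illustration only, not Flagsmith's implementation: it uses the Python standard library alone, does not call the Sentry SDK, and every name in it (`slow_downstream`, `handle_request`, `simulate`, the 0.2 second budget) is hypothetical. The idea is that the request path gives the downstream call a bounded time budget, and the test checks that a slow dependency cannot push the response time past that budget.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical names throughout; nothing here comes from the Flagsmith codebase.
_executor = ThreadPoolExecutor(max_workers=4)


def slow_downstream(delay_seconds: float) -> str:
    """Stand-in for a degraded downstream service (for example an APM endpoint)."""
    time.sleep(delay_seconds)
    return "reported"


def handle_request(downstream, budget_seconds: float) -> dict:
    """Serve a request while giving the downstream call a bounded time budget."""
    started = time.monotonic()
    future = _executor.submit(downstream)
    try:
        future.result(timeout=budget_seconds)
        downstream_ok = True
    except FutureTimeout:
        # Stop waiting for the slow dependency; the response goes out regardless.
        downstream_ok = False
    return {
        "status": 200,
        "downstream_ok": downstream_ok,
        "elapsed": time.monotonic() - started,
    }


def simulate(delay_seconds: float, budget_seconds: float = 0.2) -> dict:
    """Run one request against a downstream that takes delay_seconds to answer."""
    return handle_request(lambda: slow_downstream(delay_seconds), budget_seconds)
```

Calling `simulate(1.0)` models a downstream outage: the request should still return a 200 in roughly the budgeted time, with `downstream_ok` set to `False`. A real integration test would replace `slow_downstream` with a fault-injecting stub for each actual dependency.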