FullStory incident

Data capture outage for orgs on EU data center

Notice Resolved

FullStory reported a notice-level incident on September 15, 2023 affecting Data Capture - Web, Data Capture - Native Mobile, and API. The incident has been resolved; the full update timeline is below.

Started
Sep 15, 2023, 07:33 PM UTC
Resolved
Sep 15, 2023, 07:33 PM UTC
Duration
Detected by Pingoru
Sep 15, 2023, 07:33 PM UTC

Affected components

Data Capture - Web
Data Capture - Native Mobile
API

Update timeline

  1. resolved Sep 15, 2023, 07:33 PM UTC

    End user sessions and data were not captured from 2:25pm ET to 3:05pm ET for accounts on our EU data center. All missing activity during this time period is non-recoverable and may impact Metrics, Funnels, Dashboards, and Conversions. During this time, native mobile builds could also have failed. Re-running the build will properly upload the assets. This issue did not impact any FullStory accounts utilizing our US data center. (Your FullStory URLs would start with app.eu1.fullstory.com if you are utilizing our EU data center.)
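To compare the window in this update with UTC timestamps, and to check which data center a FullStory URL points at, a minimal sketch could look like the following. It assumes that "ET" in mid-September means Eastern Daylight Time (UTC-4) and that the date is the one in the postmortem title; the function names and the example URL path are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse

# Assumption: "ET" in mid-September is Eastern Daylight Time (UTC-4).
EDT = timezone(timedelta(hours=-4), name="EDT")

# Host quoted in the update for accounts on the EU data center.
EU_APP_HOST = "app.eu1.fullstory.com"


def uses_eu_data_center(fullstory_url: str) -> bool:
    """Return True if the URL's host is the EU app host named in the update."""
    host = urlparse(fullstory_url).hostname or ""
    return host == EU_APP_HOST


def et_to_utc(year: int, month: int, day: int, hour: int, minute: int) -> datetime:
    """Convert an Eastern Daylight Time wall-clock time to UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=EDT)
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    # Hypothetical URL path, used only for illustration.
    print(uses_eu_data_center("https://app.eu1.fullstory.com/ui/example"))
    print(et_to_utc(2023, 9, 13, 14, 25).isoformat())
    print(et_to_utc(2023, 9, 13, 15, 5).isoformat())
```

Under these assumptions the window converts to 18:25 to 19:05 UTC. The postmortem gives 18:15 UTC as the start, and the source does not reconcile the two. The URL must include a scheme such as `https://`, otherwise `urlparse` does not populate the hostname.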

  2. postmortem Sep 20, 2023, 07:04 PM UTC

    # Postmortem 2023/09/13 EU1 Data Capture Outage

    Due to an infrastructure issue caused by a sudden spike of unanticipated traffic, end user session data was not captured for both web and mobile sessions on 2023-09-13 between 18:15 UTC and 19:05 UTC, for all Orgs hosted in our European (EU1) data center. Orgs hosted on our North American (NA1) data center were not impacted.

    Any existing sessions being captured during the impacted timeframe may have gaps in intermediate pages, resulting in missing segments of time in playback, and any new sessions (web or mobile) started during this time may have been dropped. Analytics features that rely on this session and event data are also impacted, which means that there may be missing data points in metrics, funnels, dashboards, and conversions.

    This postmortem details the impact on our customers, the root cause of the issue, how we addressed the problem, and the steps we're taking to prevent this and similar types of issues in the future.

    # Customer Impact

    All Orgs on our EU data center were impacted by this incident. You can check if your Org was impacted by seeing if your [Org’s ID](https://help.fullstory.com/hc/en-us/articles/360047075853-How-do-I-find-my-FullStory-Org-Id-) ends in “-eu1”. Any web and mobile capture data coming to FullStory between 18:15 UTC and 19:05 UTC was not captured and is not recoverable.

    # Root Cause

    On 2023-09-13 at 18:15 UTC our EU data center experienced an unexpected spike in traffic. Our backend data capture service was unable to scale fast enough to accommodate the traffic increase and the service eventually crashed. The service then crashed again on each attempted restart, as it still could not scale fast enough to handle the incoming traffic.

    # Resolution

    Our on-call operations team was immediately alerted and intervened to resolve the issue, including manually scaling up the data capture service so that it resumed proper operation.

    # Process Changes and Prevention

    So far we have:

    * Scaled up existing resources for our data capture service to handle the new volume of traffic
    * Updated our infrastructure so that the data capture service will be able to scale up faster in the future

    To prevent a recurrence of this incident we will be:

    * Further modifying our service scaling policies so that we can handle similar spikes more smoothly
    * Improving our monitoring and alerting so we can address these kinds of issues more quickly

    We deeply regret this incident and invite any FullStory customer who was materially affected to contact [[email protected]](mailto:[email protected]). We stand ready to fully address all of your concerns.
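The two checks described under Customer Impact (an Org ID ending in "-eu1" and a timestamp inside the 18:15 to 19:05 UTC window on 2023-09-13) can be combined when auditing your own records for possible gaps. The following is a minimal sketch under those stated values; the function names and the sample Org IDs are hypothetical, and treating both window boundaries as inclusive is an assumption.

```python
from datetime import datetime, timezone

# Window as stated in the postmortem (UTC).
WINDOW_START = datetime(2023, 9, 13, 18, 15, tzinfo=timezone.utc)
WINDOW_END = datetime(2023, 9, 13, 19, 5, tzinfo=timezone.utc)


def is_eu1_org(org_id: str) -> bool:
    """Return True if the Org ID ends in "-eu1", as described in the postmortem."""
    return org_id.endswith("-eu1")


def in_outage_window(ts: datetime) -> bool:
    """Return True if a timezone-aware timestamp falls inside the stated window.

    Assumption: both boundaries are treated as inclusive.
    """
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return WINDOW_START <= ts <= WINDOW_END


def possibly_affected(org_id: str, ts: datetime) -> bool:
    """Return True if capture for this Org at this time may have been lost."""
    return is_eu1_org(org_id) and in_outage_window(ts)


if __name__ == "__main__":
    # Hypothetical Org IDs, used only for illustration.
    sample = datetime(2023, 9, 13, 18, 40, tzinfo=timezone.utc)
    print(possibly_affected("EXAMPLE-eu1", sample))
    print(possibly_affected("EXAMPLE-na1", sample))
```

Timestamps in other time zones compare correctly as long as they are timezone-aware, because Python compares aware datetimes by their UTC instant.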