FatTail incident

AdBook+ System Performance - Affecting adbook03.fattail.com

Notice Resolved

FatTail experienced a notice incident on November 15, 2023 affecting AdBook+ System Performance, lasting 15h. The incident has been resolved; the full update timeline is below.

Started
Nov 15, 2023, 12:53 AM UTC
Resolved
Nov 15, 2023, 03:53 PM UTC
Duration
15h
Detected by Pingoru
Nov 15, 2023, 12:53 AM UTC

Affected components

AdBook+ System Performance

Update timeline

  1. investigating Nov 15, 2023, 12:53 AM UTC

    We are aware of and investigating an issue affecting only clients who log in via adbook03.fattail.com. Users may experience intermittent slow loading screens while logging in or navigating within the application. Thank you for your patience while we work to restore performance. FatTail Support.

  2. investigating Nov 15, 2023, 02:46 AM UTC

    Our engineering teams are continuing to monitor performance and investigate causes of intermittent latency affecting users of adbook03.fattail.com. Users should continue to work as normal in the system. Thank you for your patience as we work to restore full system performance.

  3. resolved Nov 15, 2023, 03:53 PM UTC

    At approximately 3 am EST on Wednesday, November 15th, following a brief service disruption, our engineers were able to identify and resolve the issue that was causing this incident. The site has been fully performant since the fix was applied. Thank you for your patience while we worked to investigate and resolve the performance issue. A postmortem will be published once our team has finalized their analysis. If you have any concerns or follow-up questions, please contact us via support.fattail.com. Regards, FatTail Support

  4. postmortem Nov 27, 2023, 09:27 PM UTC

    ### **Summary of Impact**

    At approximately 7:40 pm EST on Tuesday, November 14th, FatTail monitoring detected a system exception on the Prod03 environment. The engineering team immediately began investigating. Some users experienced intermittent slow loading screens, or 502 errors, while logging in or navigating within the application. Service was fully restored at approximately 2:36 am EST on Wednesday, November 15th.

    ### **Root Cause**

    The incident was caused by a circular reference created while loading campaigns with a certain line item sort criterion. This faulty logic applied to a small subset of campaigns; loading any of them reset the cache, creating latency for users in the Prod03 environment.

    ### **Mitigation**

    A temporary fix was applied to prevent the sort process that caused this incident. FatTail engineers are working on a permanent solution (DEV-44974) to generally separate line item sequencing from the IO Loader, which will prevent any future incidents of this nature.

    ### **Next Steps**

    DEV-44974 is being prioritized and is currently slated for release no later than January 2024.
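To illustrate the kind of circular reference named in the root cause above, here is a minimal sketch of how a loader could check a set of ordering references for cycles before sorting. It is not FatTail's code: the function name `find_cycle`, the `refs` mapping, and the line item IDs are all hypothetical, and the postmortem does not describe how the actual references are stored.

```python
from typing import Dict, List, Optional


def find_cycle(refs: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of IDs (first ID repeated at the end),
    or None if the reference graph is acyclic.

    `refs` maps a hypothetical line item ID to the IDs it refers to
    when its sort position is computed.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state: Dict[str, int] = {}
    path: List[str] = []  # IDs on the current depth-first search path

    def visit(node: str) -> Optional[List[str]]:
        state[node] = IN_PROGRESS
        path.append(node)
        for nxt in refs.get(node, []):
            nxt_state = state.get(nxt, UNSEEN)
            if nxt_state == IN_PROGRESS:
                # nxt is already on the current path: a circular reference.
                return path[path.index(nxt):] + [nxt]
            if nxt_state == UNSEEN:
                found = visit(nxt)
                if found is not None:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in refs:
        if state.get(node, UNSEEN) == UNSEEN:
            found = visit(node)
            if found is not None:
                return found
    return None


# Hypothetical data: three line items whose sort references form a loop.
cyclic = {"li-1": ["li-2"], "li-2": ["li-3"], "li-3": ["li-1"]}
acyclic = {"li-1": ["li-2"], "li-2": ["li-3"], "li-3": []}
```

A loader that ran a check like this before sorting could reject or skip the affected campaign instead of following the references indefinitely. Whether that matches the planned separation of line item sequencing from the IO Loader is not stated in the postmortem.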