Flash incident

Sporadic platform latency

Flash experienced a notice-level incident on August 12, 2023 that affected FlashValet and FlashPARCS and lasted 21h 44m. The incident has been resolved; the full update timeline is below.

Started
Aug 12, 2023, 03:05 PM UTC
Resolved
Aug 13, 2023, 12:50 PM UTC
Duration
21h 44m
Detected by Pingoru
Aug 12, 2023, 03:05 PM UTC

Affected components

FlashValet, FlashPARCS

Update timeline

  1. investigating Aug 12, 2023, 03:05 PM UTC

    FLASH OS SYSTEM UPDATE

    I wanted to provide an update on our platform’s performance, what happened yesterday, and where we stand today:

    · We discovered Microsoft installed a routine patch on their cloud servers yesterday.
    · Unfortunately, their patch conflicts with another piece of their own technology, and it is impacting our site performance.
    · Internally, Flash did not perform any site maintenance in and around this time, nor any maintenance last week.
    · Believing the Microsoft patch was the root cause of the issues, we engaged with Microsoft and have since rolled back their patch to begin regaining platform stability.
    · We monitored through the entire night and observed that the site is taking traffic as normal but experiencing occasional latency (taking seconds when it usually takes sub-second).
    · We remain actively engaged with Microsoft to solve for any remaining latency.
    · Other than occasional slowness, the platform should be performing as expected.

    We will continue to provide more updates as we gain additional insight. Please don’t hesitate to contact us and let us know if you are experiencing any issues at your site.

    Thank you, and our apologies.

    Anthony Broad-Crawford, CTO

  2. investigating Aug 12, 2023, 07:49 PM UTC

    We are continuing to work with Microsoft to investigate the issue and improve system latency.

  3. monitoring Aug 12, 2023, 10:22 PM UTC

    FLASH OS SYSTEM UPDATE

    I wanted to update you on our platform's performance and where we stand now.

    · The system should be fully returning to normal: no further sporadic latency.
    · Working with our partner, Microsoft, we made several patches to our system to restore typical performance characteristics.
    · The fixes focused primarily on 3rd party vendors.

    The team will continue to monitor through the evening and into tomorrow morning. We will provide more updates should we see a change in site performance or latency. However, at this time, that is not anticipated.

    We will be authoring a detailed incident response document shortly. We currently target no later than Monday 9 am CT for its publication.

    Please don't hesitate to contact us and let us know if you are experiencing any issues at your site.

    Thank you, and our apologies.

    Anthony Broad-Crawford, CTO

  4. resolved Aug 13, 2023, 12:50 PM UTC

    FLASH OS SYSTEM UPDATE The teams have monitored through the evening, night, and early morning without issue. All systems are healthy. We will be authoring a detailed incident response document shortly. We currently target no later than Monday 9 am CT for its publication. Please don't hesitate to contact us and let us know if you are experiencing any issues at your site. Thank you. Anthony Broad-Crawford, CTO

  5. postmortem Aug 14, 2023, 12:54 PM UTC

    **FLASH OS SYSTEM POST-MORTEM** The Flash On-Call and Site Reliability teams monitored through the evening, night, and early morning rush hour today without issue. All systems remain healthy. A detailed incident summary from this weekend is available [here](https://www.dropbox.com/scl/fi/iwcmsgozc1c1k5gy446d4/FlashParking_IncidentRReport_8.11.2023.pdf?rlkey=yim13uiv1oyqmngpeoj6pnpiy&dl=0). Thank you. Anthony Broad-Crawford, CTO