Skylight incident

Data Processing Delay

Minor Resolved

Skylight experienced a minor incident on October 10, 2019 affecting Application, lasting 1d 8h. The incident has been resolved; the full update timeline is below.

Started
Oct 10, 2019, 06:07 AM UTC
Resolved
Oct 11, 2019, 02:47 PM UTC
Duration
1d 8h
Detected by Pingoru
Oct 10, 2019, 06:07 AM UTC

Affected components

Application

Update timeline

  1. investigating Oct 10, 2019, 06:07 AM UTC

    Since approximately 3:20 UTC, we have been unable to process ~3% of all inbound data due to a queuing issue. The inbound data is still present in the queue, but it is not yet certain whether we will be able to process it after the fact. However, we believe that some delay in resolution will not worsen the situation or harm our chances of data recovery. As it is late at night for many of us, we will get some rest and investigate further in the morning. Our apologies for any inconvenience this may cause you.

  2. identified Oct 10, 2019, 03:12 PM UTC

    We've identified the cause and have resumed processing of the data. We'll update again once we've caught up.

  3. investigating Oct 10, 2019, 03:37 PM UTC

    An unanticipated side effect of our fix has caused data to stop processing entirely for some workers. We are investigating.

  4. monitoring Oct 10, 2019, 04:28 PM UTC

    We've identified the issue and deployed a hotfix. We're now processing all backlogged data.

  5. monitoring Oct 10, 2019, 06:23 PM UTC

    Nearly all backlogged data has been processed. We are still processing the backlog for the original 3% of data that was not being handled correctly. We anticipate resolution within the next 12 hours. Normal usage of Skylight should not be affected, as 97% of the data for all apps will be present in our UI.

  6. resolved Oct 11, 2019, 02:47 PM UTC

    We’ve caught up with all data processing. We’re continuing to investigate internally to avoid future issues.