Skylight experienced a minor incident on January 2, 2022 affecting Application and Hosting, lasting 16m. The incident has been resolved; the full update timeline is below.
Affected components
- Application
- Hosting
Update timeline
- identified Jan 02, 2022, 05:37 PM UTC
The Skylight dashboard is currently inaccessible due to a configuration issue. This outage also affects agent authentication: new authentication requests from agents will not succeed at the moment. Agents that are already authenticated can continue to report data until their authentication session expires. The data processing pipeline itself is unaffected by this outage, as it is hosted on a different provider. However, because agents that fail to authenticate cannot submit traces, we expect lapses in Skylight data during the outage period.
- monitoring Jan 02, 2022, 05:40 PM UTC
We have manually uploaded a new certificate, completed the migration, and are monitoring the situation. The Skylight dashboard (skylight.io) should be accessible immediately. If you are still unable to access the site, please email [email protected] for assistance. Your Skylight agents should resume reporting data once they retry the previously failed authentication request. If this does not occur, you can try restarting your app, which forces the agent to restart and authenticate again. If that still doesn't work, please email [email protected] for further help. We are very sorry for the trouble.
- resolved Jan 02, 2022, 05:57 PM UTC
Our metrics indicate that the agent report rate has recovered to its pre-incident level. We believe most customer agents have resumed normal reporting and that the issue has been resolved. If you continue to encounter issues, please email [email protected] for assistance. Unfortunately, if your agent was "locked out" because its authentication expired and it was unable to report data during the outage, that unreported data will not be available to view on the dashboard. We are truly sorry about this.