GraphCDN incident
Stellate Services unavailable because of Cloudflare Worker KV outage
GraphCDN experienced a critical incident on June 7, 2023, affecting GraphQL Edge Caching, GraphQL Metrics, and one more component. It lasted 44 minutes. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Jun 07, 2023, 07:01 PM UTC
We are looking into an issue with Stellate right now. We will update this incident as we have more data available.
- monitoring Jun 07, 2023, 07:07 PM UTC
As far as we can tell, the Cloudflare Workers KV service, which we depend on, had an outage of about 5 to 10 minutes. It appears to be back up and running again. We are monitoring the situation and will update our status page as needed.
- monitoring Jun 07, 2023, 07:18 PM UTC
All services are back up and running. We are monitoring the status of our services as well as the Cloudflare Workers KV store.
- resolved Jun 07, 2023, 07:46 PM UTC
Cloudflare posted an update on their status page and marked the underlying incident, which caused this outage, as resolved. See https://www.cloudflarestatus.com/incidents/1mj9jch1tqf9 for their update.
- postmortem Jun 08, 2023, 10:48 AM UTC
* Stellate currently relies on Cloudflare services for parts of our offerings.
* Cloudflare had a global outage of their KV store for ~10 minutes on June 7th, from 6:51 PM to 7:01 PM. They provide a summary of this incident on their own status page at [https://www.cloudflarestatus.com/incidents/1mj9jch1tqf9](https://www.cloudflarestatus.com/incidents/1mj9jch1tqf9).
* Any traffic that resulted in cache misses or cache passes triggered an HTTP 500 error page during that time frame. Traffic handled directly by the edge cache (i.e., cache hits) was not affected.
  * ~30% of traffic resulted in cache hits and was served correctly.
  * ~70% of traffic resulted in cache misses or passes; these requests returned an HTTP 500 error.
* We are currently working on a larger infrastructure improvement that will remove the dependency on Cloudflare Workers KV.
* Additionally, we will review all possible failure points that could make Stellate core services inaccessible (in the event of a third-party outage) and investigate options for additional redundancies for those services.
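As an illustration of the kind of redundancy mentioned in the last point, here is a minimal sketch of a stale-on-error fallback around a key-value read. It is not Stellate's implementation, and how Stellate actually reads from KV is not described in this report. The `KVLike` interface, the `getWithFallback` function, and the in-memory `lastKnownGood` map are all hypothetical names chosen for this example.

```typescript
// Hypothetical minimal interface: only the `get` call this sketch needs.
// It is NOT the real Workers KV type definition.
interface KVLike {
  get(key: string): Promise<string | null>;
}

// Last successfully read values, kept in memory (assumption: the process
// lives long enough for this to be useful).
const lastKnownGood = new Map<string, string>();

type Lookup =
  | { source: "kv"; value: string | null }
  | { source: "stale"; value: string };

// Try the store first; if the read throws, serve the last value we saw
// for that key instead of failing the whole request.
async function getWithFallback(kv: KVLike, key: string): Promise<Lookup> {
  try {
    const value = await kv.get(key);
    if (value !== null) {
      lastKnownGood.set(key, value); // remember the fresh value
    }
    return { source: "kv", value };
  } catch (err) {
    const stale = lastKnownGood.get(key);
    if (stale !== undefined) {
      return { source: "stale", value: stale };
    }
    throw err; // nothing cached for this key: surface the failure
  }
}
```

With a fallback of this shape, a request that needs a previously seen key could still be answered during a store outage, at the cost of possibly serving outdated data. Keys that were never read before would still fail.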