GraphCDN Outage History

GraphCDN had 8 outages in the last 2 years totaling 69h 43m of downtime, averaging 0.3 incidents per month.

The 8 outages since July 22, 2024 are summarised below with incident details, duration, and resolution information.

Source: https://swl3fmh6tryh.statuspage.io

Major August 20, 2025

Domain resolution

Detected by Pingoru: Aug 20, 2025, 09:35 PM UTC
Resolved: Aug 20, 2025, 10:50 PM UTC
Duration: 1h 14m

Affected: GraphQL Edge Caching, GraphQL Metrics, GraphQL Rate Limiting, GraphQL Developer Portals, User API, Admin API

Timeline · 7 updates

investigating Aug 20, 2025, 09:35 PM UTC

We are currently investigating this issue.
investigating Aug 20, 2025, 09:36 PM UTC

We are continuing to investigate this issue.
investigating Aug 20, 2025, 09:57 PM UTC

Noticing elevated error rates across our KV stores.
monitoring Aug 20, 2025, 10:42 PM UTC

A fix has been implemented and we are monitoring the results.
monitoring Aug 20, 2025, 10:43 PM UTC

Related https://www.fastlystatus.com/incident/377843
resolved Aug 20, 2025, 10:50 PM UTC

This incident has been resolved.
postmortem Aug 21, 2025, 04:32 PM UTC

## Summary

On August 20, 2025, our services experienced elevated errors on our KV stores due to a third-party vendor issue. The incident was caused by a faulty enhancement deployment on the vendor's platform, which was subsequently rolled back to restore service.

## Root Cause

Third-party vendor deployed a faulty enhancement that caused backend errors for a subset of customers, primarily in Singapore and California regions.

## Resolution

Vendor engineering team rolled back the problematic deployment and confirmed service restoration.

## Lessons Learned

* Vendor responded quickly with clear communication
* Need better early detection of vendor-side issues
* Consider additional monitoring for vendor dependencies

Related: https://www.fastlystatus.com/incident/377843
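The postmortem's lessons about earlier detection of vendor-side issues could be approached with a rolling error-rate check on calls to the vendor. The following is only a minimal sketch under stated assumptions: the class name `VendorErrorMonitor`, the window size, the threshold, and the minimum sample count are all hypothetical and are not taken from GraphCDN's actual monitoring.

```python
from collections import deque


class VendorErrorMonitor:
    """Tracks recent vendor call outcomes in a fixed-size sliding window.

    Hypothetical helper: names and default values are illustrative only.
    """

    def __init__(self, window_size: int = 100, threshold: float = 0.05,
                 min_samples: int = 20) -> None:
        self.threshold = threshold
        self.min_samples = min_samples
        # True = call failed, False = call succeeded.
        # deque(maxlen=...) drops the oldest outcome automatically.
        self._outcomes: deque = deque(maxlen=window_size)

    def record(self, failed: bool) -> None:
        """Record the outcome of one call to the vendor."""
        self._outcomes.append(failed)

    def error_rate(self) -> float:
        """Fraction of failed calls in the current window (0.0 if empty)."""
        if not self._outcomes:
            return 0.0
        return sum(self._outcomes) / len(self._outcomes)

    def should_alert(self) -> bool:
        """Alert only once enough samples exist and the rate exceeds the threshold."""
        enough = len(self._outcomes) >= self.min_samples
        return enough and self.error_rate() > self.threshold
```

A caller would invoke `record(...)` after every vendor request and check `should_alert()` periodically. The minimum sample count is there so that a single early failure does not trigger an alert on its own.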

Major June 12, 2025

Increased Errors with Multiple Third Party Service Provider

Detected by Pingoru: Jun 12, 2025, 07:28 PM UTC
Resolved: Jun 12, 2025, 08:43 PM UTC
Duration: 1h 14m

Affected: Dashboard, GraphQL Edge Caching, GraphQL Metrics, GraphQL Rate Limiting, GraphQL Developer Portals, User API, Admin API

Timeline · 6 updates

investigating Jun 12, 2025, 07:28 PM UTC

We are currently investigating this issue.
investigating Jun 12, 2025, 07:30 PM UTC

We are continuing to investigate this issue.
investigating Jun 12, 2025, 07:33 PM UTC

We are continuing to investigate this issue.
investigating Jun 12, 2025, 07:35 PM UTC

We are continuing to investigate this issue.
monitoring Jun 12, 2025, 08:23 PM UTC

Error rates are decreasing, and we are actively working with our vendors to fully restore services.
resolved Jun 12, 2025, 08:43 PM UTC

This incident has been resolved.

Minor February 20, 2025

Elevated errors on Dashboard

Detected by Pingoru: Feb 20, 2025, 05:52 PM UTC
Resolved: Feb 22, 2025, 03:23 AM UTC
Duration: 1d 9h

Affected: Dashboard, Admin API

Timeline · 4 updates

investigating Feb 20, 2025, 05:52 PM UTC

We are currently investigating this issue.
monitoring Feb 21, 2025, 10:54 AM UTC

A fix has been implemented and we are monitoring the results.
monitoring Feb 21, 2025, 06:55 PM UTC

We are continuing to monitor for any further issues.
resolved Feb 22, 2025, 03:23 AM UTC

This incident has been resolved.

Minor January 7, 2025

Dashboard API Experiencing Issues

Detected by Pingoru: Jan 07, 2025, 07:00 AM UTC
Resolved: Jan 08, 2025, 11:00 AM UTC
Duration: 1d 4h

Affected: Dashboard, User API

Timeline · 5 updates

investigating Jan 07, 2025, 11:19 AM UTC

We are currently investigating this issue.
identified Jan 07, 2025, 01:44 PM UTC

We are currently experiencing database performance issues and are actively working on updates to resolve the situation. Our team is implementing changes to restore normal service as quickly as possible. We will continue to provide updates as we progress.
identified Jan 07, 2025, 02:49 PM UTC

We have identified potential optimizations and are in the process of testing them to ensure their effectiveness. While the main degradation has calmed down, we’re taking precautionary steps to apply patches and ensure the system's stability and safety moving forward.
monitoring Jan 07, 2025, 03:48 PM UTC

We have applied new database indexes that should improve performance. We are still working on rolling out a production deployment to include additional API improvements.
resolved Jan 08, 2025, 11:00 AM UTC

This incident has been resolved.

Minor October 11, 2024

High Error Rates on Dashboard

Detected by Pingoru: Oct 11, 2024, 07:32 PM UTC
Resolved: Oct 11, 2024, 09:25 PM UTC
Duration: 1h 53m

Affected: Dashboard, User API, Admin API

Timeline · 3 updates

identified Oct 11, 2024, 07:32 PM UTC

The issue has been identified and a fix is being implemented.
identified Oct 11, 2024, 08:16 PM UTC

We are continuing to work on a fix for this issue.
resolved Oct 11, 2024, 09:25 PM UTC

This incident has been resolved.

Minor October 10, 2024

High Error Rates Detected on Dashboard

Detected by Pingoru: Oct 10, 2024, 11:29 PM UTC
Resolved: Oct 11, 2024, 12:06 AM UTC
Duration: 36m

Affected: Dashboard, User API, Admin API

Timeline · 3 updates

investigating Oct 10, 2024, 11:29 PM UTC

We are currently investigating this issue.
identified Oct 10, 2024, 11:49 PM UTC

The issue has been identified and a fix is being implemented.
resolved Oct 11, 2024, 12:06 AM UTC

This incident has been resolved.

Major August 29, 2024

Issues with GraphQL Metrics ingesting new data

Detected by Pingoru: Aug 29, 2024, 09:47 AM UTC
Resolved: Aug 29, 2024, 10:34 AM UTC
Duration: 46m

Affected: GraphQL Metrics

Timeline · 4 updates

investigating Aug 29, 2024, 09:10 AM UTC

We are currently looking into an issue with our GraphQL Metrics systems, which prevents new data from showing up on the dashboard. All other systems are working as expected and our GraphQL Edge Caching and GraphQL Rate Limiting systems are not affected.
identified Aug 29, 2024, 09:47 AM UTC

The issue has been identified as an issue with one of our infrastructure providers and their ClickPipes service. They have opened an investigation and updated their status page at https://status.clickhouse.com/incidents/f6j8dfnyy6dn
monitoring Aug 29, 2024, 10:13 AM UTC

We are seeing signs of recovery, and new data is again being ingested into our GraphQL Metrics systems. We are monitoring all systems and are in touch with our infrastructure provider for further updates. Please note that the GraphQL Metrics system might lag behind near real-time for a while as queued updates get processed.
resolved Aug 29, 2024, 10:34 AM UTC

This issue has been resolved. We'll continue monitoring all services, but we don't expect any further issues.

Major July 22, 2024

Issues with Purging API

Detected by Pingoru: Jul 22, 2024, 05:40 AM UTC
Resolved: Jul 22, 2024, 08:07 AM UTC
Duration: 2h 27m

Affected: Purging API

Timeline · 6 updates

investigating Jul 22, 2024, 05:40 AM UTC

We are currently looking into an issue with the Purging API.
identified Jul 22, 2024, 07:32 AM UTC

The team has identified the issue and is currently implementing a fix.
monitoring Jul 22, 2024, 07:38 AM UTC

A fix has been implemented and the Purging API is working as expected again. We are monitoring all systems to make sure they are working as expected.
monitoring Jul 22, 2024, 07:45 AM UTC

We are continuing to monitor for any further issues.
resolved Jul 22, 2024, 08:07 AM UTC

This incident has been resolved.
postmortem Jul 23, 2024, 06:36 PM UTC

During a routine employee offboarding, we revoked that employee’s access to Fastly. Revoking their access to Fastly also revoked all access tokens that engineer created. Unfortunately, this included the central API token all our systems use to communicate with the Fastly API.

This had two immediate impacts:

1. Purging started failing silently: Stellate’s purging API kept returning successful responses even though data would not be evicted from the cache.
2. Service configuration updates failing silently: Service configuration updates appeared to persist even though they were not updated in the CDN.

As part of the incident response, we switched the central Fastly API token to a new token owned by a shared engineering account. Further, we will work on gaining better visibility and alerting on failure conditions with the purging API, as well as audit all tokens in use by our services to ensure they are not owned by individual engineers.
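The silent-failure mode described above (the purging API reporting success while the upstream request was rejected) can be illustrated with a wrapper that refuses to report success unless the upstream call confirms it. This is a minimal sketch, not Stellate's implementation: `purge_keys`, `PurgeError`, and the injected `send_purge` callable are hypothetical names, and `send_purge` is assumed to perform the upstream request and return its HTTP status code.

```python
from typing import Callable, Iterable


class PurgeError(RuntimeError):
    """Raised when the upstream CDN does not confirm a purge."""


def purge_keys(keys: Iterable[str], send_purge: Callable[[str], int]) -> int:
    """Purge each key and return how many purges were confirmed.

    `send_purge` is a hypothetical callable (an assumption of this sketch)
    that sends the upstream purge request for one key and returns the
    HTTP status code of the response.
    """
    purged = 0
    for key in keys:
        status = send_purge(key)
        if not 200 <= status < 300:
            # Fail loudly instead of returning success. A rejected token
            # might show up here as, for example, a 401 or 403 (assumption).
            raise PurgeError(
                f"purge of {key!r} failed with upstream status {status}"
            )
        purged += 1
    return purged
```

Raising on any non-2xx status means the caller's own API response can only say "success" when the cache eviction was actually acknowledged, and the raised error gives alerting something concrete to hook into.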
