Grafana Outage History

Grafana had 77 outages in the last 2 years totaling 279h 30m of downtime — averaging 3.2 incidents per month.

There were 77 Grafana outages since February 25, 2026 totaling 279h 30m of downtime. Each is summarised below — incident details, duration, and resolution information.
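The summary figures above can be cross-checked with a little arithmetic. The sketch below is only illustrative: it assumes "the last 2 years" means a 24-month window, and the variable names are made up for this example rather than taken from the status page.

```python
from datetime import timedelta

# Figures quoted in the summary above.
incident_count = 77
total_downtime = timedelta(hours=279, minutes=30)
window_months = 24  # assumption: "last 2 years" taken as 24 months

# Average number of incidents per month over the window.
incidents_per_month = incident_count / window_months

# Mean downtime per incident, rounded to the nearest minute.
mean_minutes = round(total_downtime.total_seconds() / 60 / incident_count)
mean_hours, mean_rem = divmod(mean_minutes, 60)

print(f"{incidents_per_month:.1f} incidents per month")
print(f"about {mean_hours}h {mean_rem}m of downtime per incident")
```

Under that assumption the per-month figure matches the 3.2 quoted above, and the mean downtime works out to a little over three and a half hours per incident.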

Source: https://status.grafana.com

Notice June 12, 2026

Brief Loki Prod-012-eu-west-2 Disruption

Detected by Pingoru
Jun 12, 2026, 09:54 PM UTC
Resolved
Jun 12, 2026, 06:30 PM UTC
Duration
Timeline · 1 update
  1. resolved Jun 12, 2026, 09:54 PM UTC

    Our team discovered a read issue between approximately 19:35 and 20:08 UTC. During that window, queries may have returned errors similar to "context deadline exceeded" (DatasourceError response). This has since been resolved and should not have caused any data loss, only a short query disruption.

Read the full incident report →

Major June 10, 2026

Grafana Dashboards page not displaying when set to ‘View by Folders’

Detected by Pingoru
Jun 10, 2026, 10:49 AM UTC
Resolved
Jun 11, 2026, 05:36 AM UTC
Duration
18h 46m
Affected: AWS Australia - prod-ap-southeast-2; AWS Brazil - prod-sa-east-1; AWS Canada - prod-ca-east-0; AWS Germany - prod-eu-west-2; AWS Germany - prod-eu-west-4; AWS India - prod-ap-south-1; AWS Japan - prod-ap-northeast-0; AWS UAE - prod-me-central-1; AWS Singapore - prod-ap-southeast-1; AWS Sweden - prod-eu-north-0; AWS US East - prod-us-east-0; AWS US East - prod-us-east-2; AWS US West - prod-us-west-0; AWS Australia - prod-au-southeast-1; AWS UK - prod-gb-south-1; AWS Ireland - prod-eu-west-6; Azure US Central - us-central2; AWS Switzerland - prod-eu-central-0; Azure Netherlands - prod-eu-west-3; GCP Australia - prod-au-southeast-0; GCP Belgium - prod-eu-west-0; GCP Brazil - prod-sa-east-0; GCP India - prod-ap-south-0; GCP Singapore - prod-ap-southeast-0; GCP UK - prod-gb-south-0; GCP US Central - prod-us-central-0; GCP US Central - prod-us-central-3; GCP US Central - prod-us-central-4; GCP US East - prod-us-east-1; play.grafana.org; Federal Cloud - AWS US Gov West
Timeline · 2 updates
  1. investigating Jun 10, 2026, 10:49 AM UTC

    We’re currently investigating an issue affecting the Grafana Dashboards page. When the page is set to ‘View by Folders’, no dashboards are shown. Our team is working on a fix. In the meantime, switching to ‘View as list’ allows access to dashboards as usual.

  2. resolved Jun 11, 2026, 05:36 AM UTC

    This incident has been resolved.

Read the full incident report →
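The Duration field in each record can be re-derived from its Detected and Resolved timestamps. The sketch below is a hedged illustration only: `incident_duration` is a hypothetical helper, not something the status page provides, and it assumes the timestamps always follow the "Jun 10, 2026, 10:49 AM UTC" layout and are parsed under an English locale.

```python
from datetime import datetime

# Assumed layout of the timestamps shown in each record,
# e.g. "Jun 10, 2026, 10:49 AM UTC".
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def incident_duration(detected: str, resolved: str) -> str:
    """Return the elapsed time between two status-page timestamps."""
    start = datetime.strptime(detected, STAMP_FORMAT)
    end = datetime.strptime(resolved, STAMP_FORMAT)
    minutes = int((end - start).total_seconds() // 60)
    if minutes < 0:
        # The resolved time precedes the detected time, so flag it.
        return "inconsistent timestamps"
    hours, rem = divmod(minutes, 60)
    return f"{hours}h {rem}m"


# Timestamps from the record above.
print(incident_duration("Jun 10, 2026, 10:49 AM UTC",
                        "Jun 11, 2026, 05:36 AM UTC"))
```

A value computed this way can differ from the listed duration by a minute, presumably because the displayed timestamps are rounded to the minute. The negative-duration branch is useful for spotting records whose resolved time is earlier than the detected time.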

Major June 9, 2026

Investigating Issues with Data Source-Managed Alerting

Detected by Pingoru
Jun 09, 2026, 11:11 PM UTC
Resolved
Jun 10, 2026, 01:11 PM UTC
Duration
14h
Affected: AWS Australia - prod-ap-southeast-2: Alertmanager and Rules Configuration API; AWS Australia - prod-ap-southeast-2: Alertmanager; AWS Brazil - prod-sa-east-1: Alertmanager and Rules Configuration API; AWS Brazil - prod-sa-east-1: Alertmanager; AWS Canada - prod-ca-east-0: Alertmanager and Rules Configuration API; AWS Canada - prod-ca-east-0: Alertmanager; AWS Germany - prod-eu-west-2: Alertmanager and Rules Configuration API; AWS Germany - prod-eu-west-2: Alertmanager; AWS UAE - prod-me-central-1: Alertmanager and Rules Configuration API; AWS UAE - prod-me-central-1: Alertmanager; AWS India - prod-ap-south-1: Alertmanager and Rules Configuration API; AWS Japan - prod-ap-northeast-0: Alertmanager and Rules Configuration API; AWS Singapore - prod-ap-southeast-1: Alertmanager and Rules Configuration API; AWS Singapore - prod-ap-southeast-1: Alertmanager; AWS Sweden - prod-eu-north-0: Alertmanager and Rules Configuration API; AWS Sweden - prod-eu-north-0: Alertmanager; AWS US East - prod-us-east-0: Alertmanager and Rules Configuration API; AWS US East - prod-us-east-0: Alertmanager; AWS US West - prod-us-west-0: Alertmanager and Rules Configuration API; AWS US West - prod-us-west-0: Alertmanager; AWS Ireland - prod-eu-west-6; AWS Switzerland - prod-eu-central-0; Azure Netherlands - prod-eu-west-3: Alertmanager and Rules Configuration API; Azure Netherlands - prod-eu-west-3: Alertmanager; Azure US Central - us-central2: Alertmanager and Rules Configuration API; Azure US Central - us-central2: Alertmanager; GCP Australia - prod-au-southeast-0: Alertmanager and Rules Configuration API; GCP Australia - prod-au-southeast-0: Alertmanager; GCP Belgium - prod-eu-west-0: Alertmanager and Rules Configuration API; GCP Belgium - prod-eu-west-0: Alertmanager; GCP Brazil - prod-sa-east-0: Alertmanager and Rules Configuration API; GCP Brazil - prod-sa-east-0: Alertmanager; GCP India - prod-ap-south-0: Alertmanager and Rules Configuration API; GCP India - prod-ap-south-0: Alertmanager; GCP Singapore - prod-ap-southeast-0: Alertmanager and Rules Configuration API; GCP Singapore - prod-ap-southeast-0: Alertmanager; GCP UK - prod-gb-south-0: Alertmanager and Rules Configuration API; GCP UK - prod-gb-south-0: Alertmanager; GCP US Central - prod-us-central-0: Alertmanager and Rules Configuration API; GCP US Central - prod-us-central-0: Alertmanager
Timeline · 3 updates
  1. investigating Jun 09, 2026, 11:11 PM UTC

    We are currently investigating an issue affecting data source-managed alerting management functionality in Grafana Cloud. Customers may experience problems viewing, creating, updating, or managing alerts through Grafana when using data source-managed alerting. This issue is limited to alert management functionality within Grafana. Alert evaluation and backend alerting services continue to operate normally. Direct alerting APIs for Mimir and Loki remain fully operational and are unaffected. Grafana-managed alerting is not impacted. We identified this issue at approximately 20:45 UTC and are actively working on a resolution. We will provide additional updates as more information becomes available. Workaround: Customers can continue to use the direct Mimir and Loki alerting APIs while we work to restore normal functionality.

  2. monitoring Jun 10, 2026, 10:57 AM UTC

    Our team has implemented a fix and we are currently monitoring the results of this.

  3. resolved Jun 10, 2026, 01:11 PM UTC

    We continue to observe a sustained period of recovery. At this time, we are considering this issue resolved. No further updates.

Read the full incident report →

Minor June 8, 2026

IRM Degraded Performance

Detected by Pingoru
Jun 08, 2026, 10:45 AM UTC
Resolved
Jun 08, 2026, 08:30 PM UTC
Duration
9h 44m
Affected: GCP Belgium - prod-eu-west-0: Alertmanager and Rules Configuration API; GCP US Central - prod-us-central-0: Alertmanager and Rules Configuration API
Timeline · 8 updates
  1. investigating Jun 08, 2026, 10:45 AM UTC

    We are experiencing access issues in IRM as there are elevated 500 API responses in prod-us-central-0.

  2. investigating Jun 08, 2026, 10:47 AM UTC

    We are continuing to investigate this issue.

  3. identified Jun 08, 2026, 11:15 AM UTC

    The issue has been identified and a fix is being implemented.

  4. identified Jun 08, 2026, 12:26 PM UTC

    We are continuing to work on a fix for this issue.

  5. identified Jun 08, 2026, 01:14 PM UTC

    The degraded performance is related to label handling, and we have seen this degradation in additional regions.

  6. identified Jun 08, 2026, 03:21 PM UTC

    We are continuing to work on a fix for this. To clarify further, this issue does not affect access to IRM or alert ingestion, notification, or delivery; it affects label handling.

  7. monitoring Jun 08, 2026, 05:11 PM UTC

    We've released a fix to the IRM app that should restore service for customers affected by the label-related issues. Thanks for your patience while we investigated. We're continuing to monitor as we confirm the resolution is in place.

  8. resolved Jun 08, 2026, 08:30 PM UTC

    This incident has been resolved.

Read the full incident report →

Major June 7, 2026

Brief Rule Evaluation Failures in prod-eu-west-3

Detected by Pingoru
Jun 07, 2026, 01:39 AM UTC
Resolved
Jun 12, 2026, 07:52 PM UTC
Duration
5d 18h
Affected: Azure Netherlands - prod-eu-west-3: Ingestion
Timeline · 7 updates
  1. investigating Jun 07, 2026, 01:39 AM UTC

    From 00:20:00 to 00:27:00 and again from 00:32:00 to 00:38:00, there were brief spikes in rule evaluation failures. Engineers are investigating.

  2. investigating Jun 07, 2026, 02:46 AM UTC

    Intermittent spikes in rule evaluation failures are continuing.

  3. investigating Jun 07, 2026, 03:50 AM UTC

    We are continuing to investigate this issue.

  4. investigating Jun 07, 2026, 06:00 AM UTC

    We’re making ongoing progress on the investigation alongside our upstream provider.

  5. monitoring Jun 07, 2026, 11:00 AM UTC

    The incident has been mitigated, and services are operating normally. We are currently monitoring the service to ensure full stability.

  6. monitoring Jun 08, 2026, 09:36 PM UTC

    The incident has been mitigated, and services are operating normally. We continue to monitor the service to ensure full stability.

  7. resolved Jun 12, 2026, 07:52 PM UTC

    This incident has been resolved. Thank you for your patience.

Read the full incident report →

Critical June 5, 2026

Permissions Issues with IRM

Detected by Pingoru
Jun 05, 2026, 04:11 PM UTC
Resolved
Jun 05, 2026, 09:35 PM UTC
Duration
5h 24m
Affected: Incident Management and Response (IRM)
Timeline · 8 updates
  1. investigating Jun 05, 2026, 04:11 PM UTC

    We are currently investigating an issue impacting permissions for IRM. As a result, users are not currently getting paged. We will provide updates as they become available.

  2. investigating Jun 05, 2026, 04:52 PM UTC

    Our engineering team is still investigating this issue. We do not have any new information to share at this time, but will continue to provide timely updates.

  3. identified Jun 05, 2026, 05:14 PM UTC

    We've identified an issue with RBAC syncing for a small number of customers and are working on a fix to restore permission services for those affected.

  4. monitoring Jun 05, 2026, 06:00 PM UTC

    A fix is being deployed now, and we are monitoring the progress.

  5. monitoring Jun 05, 2026, 06:40 PM UTC

    We've identified an earlier regression in one of our recent code changes that was affecting resolution of our previous fix. We're deploying this change now and applying a hotfix in the interim to restore access quickly.

  6. monitoring Jun 05, 2026, 07:34 PM UTC

    A fix has been released to prod and is rolling out across the fleet for IRM, restoring access to affected customers. Thanks for your patience through this work. We're continuing to monitor to confirm we've returned to a steady state.

  7. monitoring Jun 05, 2026, 08:10 PM UTC

    Continuing to monitor progress. Most customers affected should have all services restored, with a few remaining customers receiving updates as the rollout finishes out. Thanks again for your patience.

  8. resolved Jun 05, 2026, 09:35 PM UTC

    This incident has been resolved.

Read the full incident report →

Major June 5, 2026

Silences not Working as Expected

Detected by Pingoru
Jun 05, 2026, 02:02 PM UTC
Resolved
Jun 05, 2026, 03:01 PM UTC
Duration
58m
Affected: AWS Australia - prod-ap-southeast-2: Alertmanager; AWS Brazil - prod-sa-east-1: Alertmanager; AWS Canada - prod-ca-east-0: Alertmanager; AWS Germany - prod-eu-west-2: Alertmanager; AWS UAE - prod-me-central-1: Alertmanager; AWS Singapore - prod-ap-southeast-1: Alertmanager; AWS Sweden - prod-eu-north-0: Alertmanager; AWS US East - prod-us-east-0: Alertmanager; AWS US West - prod-us-west-0: Alertmanager; Azure Netherlands - prod-eu-west-3: Alertmanager; Azure US Central - us-central2: Alertmanager; GCP Australia - prod-au-southeast-0: Alertmanager; GCP Belgium - prod-eu-west-0: Alertmanager; GCP Brazil - prod-sa-east-0: Alertmanager; GCP India - prod-ap-south-0: Alertmanager; GCP Singapore - prod-ap-southeast-0: Alertmanager; GCP UK - prod-gb-south-0: Alertmanager; GCP US Central - prod-us-central-0: Alertmanager
Timeline · 2 updates
  1. identified Jun 05, 2026, 02:02 PM UTC

    We have identified an issue causing Silences to not work as expected in the Cloud (Mimir) Alertmanager. Grafana Alertmanager is working normally; this only affects data source-managed alerts.

  2. resolved Jun 05, 2026, 03:01 PM UTC

    This incident has been resolved.

Read the full incident report →

Major June 4, 2026

Grafana Assistant Skills Page Blank

Detected by Pingoru
Jun 04, 2026, 04:44 PM UTC
Resolved
Jun 04, 2026, 06:28 PM UTC
Duration
1h 44m
Affected: AWS Australia - prod-ap-southeast-2; AWS Brazil - prod-sa-east-1; AWS Canada - prod-ca-east-0; AWS Germany - prod-eu-west-2; AWS Germany - prod-eu-west-4; AWS India - prod-ap-south-1; AWS Ireland - prod-eu-west-6; AWS Japan - prod-ap-northeast-0; AWS UAE - prod-me-central-1; AWS Singapore - prod-ap-southeast-1; AWS Sweden - prod-eu-north-0; AWS US East - prod-us-east-0; AWS US East - prod-us-east-2; AWS US West - prod-us-west-0; AWS Australia - prod-au-southeast-1; AWS UK - prod-gb-south-1; Azure Netherlands - prod-eu-west-3; Azure US Central - us-central2; GCP Australia - prod-au-southeast-0; GCP Belgium - prod-eu-west-0; GCP Brazil - prod-sa-east-0; GCP India - prod-ap-south-0; GCP Singapore - prod-ap-southeast-0; GCP UK - prod-gb-south-0; GCP US Central - prod-us-central-0; GCP US Central - prod-us-central-3; GCP US Central - prod-us-central-4; GCP US East - prod-us-east-1; AWS Switzerland - prod-eu-central-0
Timeline · 4 updates
  1. investigating Jun 04, 2026, 04:44 PM UTC

    We are currently investigating an issue affecting the Skills page of Grafana Assistant. Impacted deployments will encounter a blank screen when attempting to access this page. At this time, we have observed partial impact in the us-east-0 and us-central-0 regions, and will provide an update here if the scope of impact expands.

  2. investigating Jun 04, 2026, 04:56 PM UTC

    We are continuing to investigate this issue.

  3. identified Jun 04, 2026, 05:00 PM UTC

    The issue has been identified, and we are working on a fix.

  4. resolved Jun 04, 2026, 06:28 PM UTC

    This incident has been resolved.

Read the full incident report →

Minor June 3, 2026

K6 Test Runs Degraded Performance

Detected by Pingoru
Jun 03, 2026, 08:40 PM UTC
Resolved
Jun 04, 2026, 05:47 AM UTC
Duration
9h 7m
Affected: Cloud Test Runs; Cloud Output Test Runs
Timeline · 3 updates
  1. investigating Jun 03, 2026, 08:40 PM UTC

    We are currently investigating an issue causing k6 test runs to take longer than expected to complete, or to time out within Grafana Cloud.

  2. monitoring Jun 03, 2026, 09:23 PM UTC

    We have applied a fix, and are monitoring the results.

  3. resolved Jun 04, 2026, 05:47 AM UTC

    This incident has been resolved.

Read the full incident report →

Major June 3, 2026

Synthetic Scripted/Browser checks failure

Detected by Pingoru
Jun 03, 2026, 11:38 AM UTC
Resolved
Jun 03, 2026, 06:22 PM UTC
Duration
6h 44m
Affected: AWS Australia - prod-au-southeast-1: API; AWS Australia - prod-au-southeast-1: Public Probes; AWS Brazil - prod-sa-east-1: API; AWS Brazil - prod-sa-east-1: Public Probes; AWS Canada - prod-ca-east-0: API; AWS Canada - prod-ca-east-0: Public Probes; AWS Germany - prod-eu-west-2: API; AWS Germany - prod-eu-west-2: Public Probes; AWS India - prod-ap-south-1: API; AWS India - prod-ap-south-1: Public Probes; AWS Japan - prod-ap-northeast-0: API; AWS Japan - prod-ap-northeast-0: Public Probes; AWS Singapore - prod-ap-southeast-1: API; AWS Singapore - prod-ap-southeast-1: Public Probes; AWS Sweden - prod-eu-north-0: API; AWS Sweden - prod-eu-north-0: Public Probes; AWS US East - prod-us-east-0: API; AWS US East - prod-us-east-0: Public Probes; AWS US East - prod-us-east-3: API; AWS US East - prod-us-east-3: Public Probes; AWS US West - prod-us-west-0: API; AWS US West - prod-us-west-0: Public Probes; Azure Netherlands - prod-eu-west-3: API; Azure Netherlands - prod-eu-west-3: Public Probes; Azure US Central - prod-us-central2: API; Azure US Central - prod-us-central2: Public Probes; Azure US Central - prod-us-central7: API; Azure US Central - prod-us-central7: Public Probes; GCP Australia - prod-au-southeast-0: API; GCP Australia - prod-au-southeast-0: Public Probes; GCP Belgium - prod-eu-west-0: API; GCP Belgium - prod-eu-west-0: Public Probes; GCP Brazil - prod-sa-east-0: API; GCP Brazil - prod-sa-east-0: Public Probes; GCP India - prod-ap-south-0: API; GCP India - prod-ap-south-0: Public Probes; GCP Singapore - prod-ap-southeast-0: API; GCP Singapore - prod-ap-southeast-0: Public Probes; GCP UK - prod-gb-south-0: API; GCP UK - prod-gb-south-0: Public Probes; GCP US Central - prod-us-central-0: API; GCP US Central - prod-us-central-0: Public Probes; AWS UAE - prod-me-central-1: API; AWS UAE - prod-me-central-1: Public Probes; AWS UK - prod-gb-south-1: API; AWS UK - prod-gb-south-1: Public Probes; AWS Switzerland - prod-eu-central-0: API; AWS Switzerland - prod-eu-central-0: Public Probes; AWS Indonesia - prod-ap-southeast-2: API; AWS Indonesia - prod-ap-southeast-2: Public Probes; AWS Ireland - prod-eu-west-6: API; AWS Ireland - prod-eu-west-6: Public Probes; GCP Saudi Arabia - prod-me-central-0: API; GCP Saudi Arabia - prod-me-central-0: Public Probes; GCP US East - prod-us-east-1: API; GCP US East - prod-us-east-1: Public Probes
Timeline · 3 updates
  1. investigating Jun 03, 2026, 11:38 AM UTC

    We’re currently investigating an issue affecting Synthetic Monitoring where updates for Scripted/Browser checks might fail. Our team is actively working to identify the cause. Thank you for your patience.

  2. identified Jun 03, 2026, 05:07 PM UTC

    We are in the process of deploying a fix for this issue.

  3. resolved Jun 03, 2026, 06:22 PM UTC

    This incident has been resolved.

Read the full incident report →

Minor June 2, 2026

tempo prod-25 write-path-down

Detected by Pingoru
Jun 02, 2026, 11:15 PM UTC
Resolved
Jun 03, 2026, 07:46 AM UTC
Duration
8h 31m
Affected: AWS UK - prod-gb-south-1
Timeline · 2 updates
  1. identified Jun 02, 2026, 11:15 PM UTC

    We had an outage in tempo-prod-25 and tempo-prod-24. It looks like prod-25 was affected from 21:20 to 22:40 UTC and prod-24 from 22:32 to 22:40 UTC.

  2. resolved Jun 03, 2026, 07:46 AM UTC

    This incident has been resolved.

Read the full incident report →

Minor June 1, 2026

Alert manager unavailable in prod-us-central-0

Detected by Pingoru
Jun 01, 2026, 07:53 PM UTC
Resolved
Jun 01, 2026, 08:26 PM UTC
Duration
32m
Affected: GCP US Central - prod-us-central-0: Alertmanager and Rules Configuration API; GCP US Central - prod-us-central-0: Alertmanager
Timeline · 3 updates
  1. identified Jun 01, 2026, 07:53 PM UTC

    Starting at 18:30 UTC, we noticed Alertmanager unavailability limited to prod-us-central-0. This affects Grafana-managed and data source-managed alerting, disrupting updates to the Alertmanager configuration and causing limited disruption to alert sending. We have identified the cause and are in the process of remediation.

  2. monitoring Jun 01, 2026, 08:02 PM UTC

    A fix has been implemented and we are monitoring the results.

  3. resolved Jun 01, 2026, 08:26 PM UTC

    This incident has been resolved.

Read the full incident report →

Major May 29, 2026

Grafana Loki Log Query Issues

Detected by Pingoru
May 29, 2026, 09:03 AM UTC
Resolved
May 29, 2026, 10:59 AM UTC
Duration
1h 56m
Affected: AWS Australia - prod-ap-southeast-2; AWS Brazil - prod-sa-east-1; AWS Canada - prod-ca-east-0; AWS Germany - prod-eu-west-2; AWS UAE - prod-me-central-1; AWS India - prod-ap-south-1; AWS Japan - prod-ap-northeast-0; AWS Singapore - prod-ap-southeast-1; AWS Sweden - prod-eu-north-0; AWS US West - prod-us-west-0; AWS US East - prod-us-east-0; AWS Ireland - prod-eu-west-6; Azure US Central - us-central7; GCP India - prod-ap-south-0; GCP Singapore - prod-ap-southeast-0; GCP UK - prod-gb-south-0
Timeline · 3 updates
  1. investigating May 29, 2026, 09:03 AM UTC

    We’re currently investigating an issue affecting Loki queries in Grafana. Customers have reported that logs are not loading or that some logs are missing. Our team is actively working to identify the cause. Thank you for your patience.

  2. monitoring May 29, 2026, 10:28 AM UTC

    We have identified the cause of this incident and a fix has been applied. Normal functions are returning. We are currently monitoring the recovery process.

  3. resolved May 29, 2026, 10:59 AM UTC

    This incident has been resolved.

Read the full incident report →

Major May 27, 2026

Prometheus Datasource Errors/Outage in prod-us-east-0

Detected by Pingoru
May 27, 2026, 08:22 PM UTC
Resolved
May 27, 2026, 10:59 PM UTC
Duration
2h 36m
Affected: AWS US East - prod-us-east-0: Querying
Timeline · 4 updates
  1. investigating May 27, 2026, 08:22 PM UTC

    We’re currently investigating an issue affecting Prometheus datasources causing 500 internal or Unexpected errors. Our team is actively working to identify the cause. Thank you for your patience.

  2. investigating May 27, 2026, 09:36 PM UTC

    We continue to investigate an issue affecting Prometheus datasources causing intermittent timeouts and unexpected errors, primarily impacting alert rule evaluations. Our team is actively working to identify the cause. Thank you for your patience.

  3. investigating May 27, 2026, 09:57 PM UTC

    We are seeing recovery across affected Prometheus datasources, and error rates have significantly improved. The service is recovering without any required customer action, and our team continues to monitor stability while we investigate the underlying cause. We’ll provide another update as we learn more.

  4. resolved May 27, 2026, 10:59 PM UTC

    This incident has been resolved. Thank you for your patience.

Read the full incident report →

Minor May 18, 2026

Grafana K6 metrics processing and test runs degradation

Detected by Pingoru
May 18, 2026, 08:24 AM UTC
Resolved
May 18, 2026, 03:42 PM UTC
Duration
7h 18m
Affected: Cloud Test Runs; Cloud Output Test Runs; Result Analysis; Private Load Zones
Timeline · 4 updates
  1. investigating May 18, 2026, 08:24 AM UTC

    We’re currently investigating an issue that is degrading metrics processing performance; test run metrics may take longer than usual to appear. Our team is actively working to identify the cause. Thank you for your patience.

  2. investigating May 18, 2026, 10:27 AM UTC

    We have identified that test runs are timing out as a result of the issue. This issue first occurred on May 15, 2026 at 8:00 PM UTC.

  3. monitoring May 18, 2026, 02:30 PM UTC

    We've stabilized the system, and test runs no longer time out. There is a small delay (a few minutes) in processing metrics at the end of a test run, but most users shouldn't be significantly affected by it. We expect the delay to resolve within the next 30-60 minutes.

  4. resolved May 18, 2026, 03:42 PM UTC

    This incident has been resolved.

Read the full incident report →

Minor May 13, 2026

Intermittent Errors and High latency Writing to Cloud Metrics, Cloud Logs and Cloud Traces

Detected by Pingoru
May 13, 2026, 08:50 AM UTC
Resolved
May 14, 2026, 07:06 AM UTC
Duration
22h 16m
Affected: GCP US Central - prod-us-central-0; GCP US Central - prod-us-central-5; GCP US Central - prod-us-central-0: Ingestion; GCP US Central - prod-us-central-5: Ingestion
Timeline · 7 updates
  1. investigating May 13, 2026, 08:50 AM UTC

    We’re currently investigating an issue causing intermittent errors and high latency when writing to Cloud Metrics. Our team is actively working to identify the cause. Thank you for your patience.

  2. investigating May 13, 2026, 10:01 AM UTC

    We’re continuing to investigate the issue causing intermittent errors and high latency when writing to Cloud Metrics. We are in contact with the cloud provider’s support team, and they are investigating the issue alongside us.

  3. investigating May 13, 2026, 10:25 AM UTC

    We have identified expanded impact affecting Grafana Cloud Logs and Grafana Cloud Traces in addition to Cloud Metrics, causing intermittent errors and increased latency when writing data. Our teams continue working on a fix and investigating the issue with the cloud provider’s support team.

  4. monitoring May 13, 2026, 01:37 PM UTC

    We are seeing signs of recovery and improved stability across impacted services over the past hour. Our teams continue to closely monitor the situation while working with the cloud provider.

  5. monitoring May 13, 2026, 03:41 PM UTC

    We continue to see signs of recovery and improved stability across impacted services. Our teams continue to closely monitor the situation while working with the cloud provider.

  6. monitoring May 13, 2026, 09:10 PM UTC

    We continue to see signs of recovery and improved stability across impacted services. Our teams continue to closely monitor the situation while working with the cloud provider.

  7. resolved May 14, 2026, 07:06 AM UTC

    We continue to observe an extended period of recovery and we're marking the incident as resolved at this point in time.

Read the full incident report →

Major May 11, 2026

"Failed to Load Dashboard" Errors

Detected by Pingoru
May 11, 2026, 09:38 PM UTC
Resolved
May 12, 2026, 04:35 PM UTC
Duration
18h 56m
Affected: AWS Australia - prod-ap-southeast-2; AWS Brazil - prod-sa-east-1; AWS Canada - prod-ca-east-0; AWS Germany - prod-eu-west-2; AWS Germany - prod-eu-west-4; AWS India - prod-ap-south-1; AWS Japan - prod-ap-northeast-0; AWS UAE - prod-me-central-1; AWS Singapore - prod-ap-southeast-1; AWS Sweden - prod-eu-north-0; AWS US East - prod-us-east-0; AWS US East - prod-us-east-2; AWS US West - prod-us-west-0; AWS Australia - prod-au-southeast-1; AWS UK - prod-gb-south-1; AWS Ireland - prod-eu-west-6; Azure US Central - us-central2; AWS Switzerland - prod-eu-central-0; Azure Netherlands - prod-eu-west-3; GCP Australia - prod-au-southeast-0; GCP Belgium - prod-eu-west-0; GCP Brazil - prod-sa-east-0; GCP India - prod-ap-south-0; GCP Singapore - prod-ap-southeast-0; GCP UK - prod-gb-south-0; GCP US Central - prod-us-central-0; GCP US Central - prod-us-central-3; GCP US Central - prod-us-central-4; GCP US East - prod-us-east-1; play.grafana.org; Federal Cloud - AWS US Gov West
Timeline · 5 updates
  1. identified May 11, 2026, 09:38 PM UTC

    Customers on Grafana Cloud may see an error on dashboard panels with "Failed to load dashboard ... json unmarshal number ...". We have identified the issue and are working to deploy the fix.

  2. identified May 12, 2026, 08:58 AM UTC

    We are continuing to work on a fix for this issue. While we do not have additional updates to share at this time, our teams remain actively engaged and we will provide further updates as soon as they become available.

  3. identified May 12, 2026, 11:11 AM UTC

    Our teams continue working on a fix for this issue. We do not have additional information to share at this time, but we will continue to provide updates as progress is made.

  4. identified May 12, 2026, 02:13 PM UTC

    The fix is currently being rolled out to all impacted environments.

  5. resolved May 12, 2026, 04:35 PM UTC

    This incident has been resolved. Thank you for your patience.

Read the full incident report →

Major May 11, 2026

SSL/TLS Connectivity Issues

Detected by Pingoru
May 11, 2026, 08:49 PM UTC
Resolved
May 11, 2026, 10:40 PM UTC
Duration
1h 51m
Affected: Azure US Central - us-central2
Timeline · 2 updates
  1. investigating May 11, 2026, 08:49 PM UTC

    We are currently investigating reports of service disruption affecting a subset of customers. Customers may experience intermittent connectivity issues, degraded performance, or SSL/TLS certificate validation errors when accessing affected services. Our engineering teams are actively working to identify the scope of impact and restore full functionality as quickly as possible. We will continue to provide updates as more information becomes available.

  2. resolved May 11, 2026, 10:40 PM UTC

    This incident has been resolved. Thank you for your patience.

Read the full incident report →

Minor May 8, 2026

Cloud Metrics - High Write Latency and Errors in prod-us-central-7

Detected by Pingoru
May 08, 2026, 09:16 PM UTC
Resolved
May 08, 2026, 10:30 PM UTC
Duration
1h 13m
Affected: Azure US Central - prod-us-central-7: Ingestion
Timeline · 2 updates
  1. monitoring May 08, 2026, 09:16 PM UTC

    From approximately 20:40 to 21:00 UTC, we experienced an issue affecting Grafana Cloud Metrics in prod-us-central-7. Affected users may have experienced high latency and/or errors during ingestion and rule evaluation. Our team has identified the cause and mitigated it. We are currently monitoring for long-term stability.

  2. resolved May 08, 2026, 10:30 PM UTC

    We have continued to observe stability. This incident is now being considered as resolved. Thank you for your patience.

Read the full incident report →

Critical May 7, 2026

Metrics read errors in prod-ap-south-1 region

Detected by Pingoru
May 07, 2026, 07:18 AM UTC
Resolved
May 07, 2026, 07:56 AM UTC
Duration
38m
Affected: AWS India - prod-ap-south-1: Querying
Timeline · 3 updates
  1. investigating May 07, 2026, 07:18 AM UTC

    From approximately 06:24 UTC, we were alerted to an issue with read errors in mimir-prod-43. Users with instances hosted in the prod-ap-south-1 region experiencing this issue may encounter an error message when querying metrics. Engineering is actively engaged and assessing the issue. We will provide updates accordingly.

  2. monitoring May 07, 2026, 07:53 AM UTC

    Engineering has released a fix and as of 07:50 UTC, customers should no longer experience errors when querying metrics. We will continue to monitor for recurrence and provide updates accordingly.

  3. resolved May 07, 2026, 07:56 AM UTC

    At this time, we have confirmed that the query errors have stopped, and we are considering this issue resolved.

Read the full incident report →

Minor May 6, 2026

Datasource Query Performance Issues

Detected by Pingoru
May 06, 2026, 08:07 PM UTC
Resolved
May 06, 2026, 08:36 PM UTC
Duration
28m
Affected: Integrations
Timeline · 1 update
  1. investigating May 06, 2026, 08:07 PM UTC

    We’re currently investigating an issue affecting Datasource query performance in prod-us-east-4. Our team is actively working to identify the cause. Thank you for your patience.

Read the full incident report →

Major May 6, 2026

Hardware failure on CSP within prod-us-west-0

Detected by Pingoru
May 06, 2026, 05:30 AM UTC
Resolved
May 06, 2026, 05:30 AM UTC
Duration
Timeline · 1 update
  1. resolved May 08, 2026, 11:04 AM UTC

    We observed an underlying hardware failure on our CSP, which triggered an automatic live VM migration. This caused a severe degradation in write performance for Grafana Cloud Metrics on prod-us-west-0 between 05:26 UTC and 05:43 UTC.

Read the full incident report →

Minor May 5, 2026

Elevated Error Rate of Browser Checks in PoP Oregon

Detected by Pingoru
May 05, 2026, 04:11 PM UTC
Resolved
May 05, 2026, 08:13 PM UTC
Duration
4h 2m
Affected: AWS US East - prod-us-east-0: Public Probes
Timeline · 4 updates
  1. investigating May 05, 2026, 04:11 PM UTC

    We’re currently investigating an issue affecting browser checks in the PoP Oregon region. Our team is actively working to identify the cause. Thank you for your patience.

  2. identified May 05, 2026, 06:13 PM UTC

    We’ve identified the cause of the issue impacting browser checks. Our team is currently implementing a fix.

  3. monitoring May 05, 2026, 07:44 PM UTC

    We’ve implemented a fix and are monitoring the results to confirm the issue is fully resolved. Services may start to recover during this time.

  4. resolved May 05, 2026, 08:13 PM UTC

    This incident has been resolved. Thank you for your patience.

Read the full incident report →

Major May 4, 2026

k6 Partial Outage

Detected by Pingoru
May 04, 2026, 10:58 PM UTC
Resolved
May 05, 2026, 02:09 AM UTC
Duration
3h 11m
Affected: GCk6 App; Cloud Test Runs; Cloud Output Test Runs; Result Analysis; Test Scheduling; Private Load Zones; Cloud logs; AWS Australia - prod-au-southeast-1: API; AWS Australia - prod-au-southeast-1: Public Probes; AWS Brazil - prod-sa-east-1: API; AWS Brazil - prod-sa-east-1: Public Probes; AWS Canada - prod-ca-east-0: API; AWS Canada - prod-ca-east-0: Public Probes; AWS Germany - prod-eu-west-2: API; AWS Germany - prod-eu-west-2: Public Probes; AWS India - prod-ap-south-1: API; AWS India - prod-ap-south-1: Public Probes; AWS Japan - prod-ap-northeast-0: API; AWS Japan - prod-ap-northeast-0: Public Probes; AWS Singapore - prod-ap-southeast-1: API; AWS Singapore - prod-ap-southeast-1: Public Probes; AWS Sweden - prod-eu-north-0: API; AWS Sweden - prod-eu-north-0: Public Probes; AWS US East - prod-us-east-0: API; AWS US East - prod-us-east-0: Public Probes; AWS US East - prod-us-east-3: API; AWS US East - prod-us-east-3: Public Probes; AWS US West - prod-us-west-0: API; AWS US West - prod-us-west-0: Public Probes; Azure Netherlands - prod-eu-west-3: API; Azure Netherlands - prod-eu-west-3: Public Probes; Azure US Central - prod-us-central2: API; Azure US Central - prod-us-central2: Public Probes; Azure US Central - prod-us-central7: API; Azure US Central - prod-us-central7: Public Probes; GCP Australia - prod-au-southeast-0: API; GCP Australia - prod-au-southeast-0: Public Probes; GCP Belgium - prod-eu-west-0: API; GCP Belgium - prod-eu-west-0: Public Probes; GCP Brazil - prod-sa-east-0: API; GCP Brazil - prod-sa-east-0: Public Probes; GCP India - prod-ap-south-0: API; GCP India - prod-ap-south-0: Public Probes; GCP Singapore - prod-ap-southeast-0: API; GCP Singapore - prod-ap-southeast-0: Public Probes; GCP UK - prod-gb-south-0: API; GCP UK - prod-gb-south-0: Public Probes; GCP US Central - prod-us-central-0: API; GCP US Central - prod-us-central-0: Public Probes; AWS UAE - prod-me-central-1: API; AWS UAE - prod-me-central-1: Public Probes; AWS UK - prod-gb-south-1: API; AWS UK - prod-gb-south-1: Public Probes; AWS Switzerland - prod-eu-central-0: API; AWS Switzerland - prod-eu-central-0: Public Probes; AWS Indonesia - prod-ap-southeast-2: API; AWS Indonesia - prod-ap-southeast-2: Public Probes; AWS Ireland - prod-eu-west-6: API; AWS Ireland - prod-eu-west-6: Public Probes; GCP Saudi Arabia - prod-me-central-0: API; GCP Saudi Arabia - prod-me-central-0: Public Probes; GCP US East - prod-us-east-1: API; GCP US East - prod-us-east-1: Public Probes
Timeline · 4 updates
  1. investigating May 04, 2026, 10:58 PM UTC

    We’re currently investigating an issue affecting k6. Our team is actively working to identify the cause. Thank you for your patience.

  2. investigating May 04, 2026, 11:23 PM UTC

    After further investigation, this issue may also be affecting Synthetic Monitoring. We are continuing to work on identifying the cause and will provide an update as soon as we have more information.

  3. monitoring May 05, 2026, 12:04 AM UTC

    We’ve implemented a fix and are monitoring the results to confirm the issue is fully resolved. Services may start to recover during this time.

  4. resolved May 05, 2026, 02:09 AM UTC

    This incident has been resolved. Thank you for your patience.

Major May 1, 2026

Ingestion Errors for AWS Cloud Provider Observability Metric Streams in prod-us-central-7

Detected by Pingoru
May 01, 2026, 09:14 AM UTC
Resolved
May 01, 2026, 10:27 AM UTC
Duration
1h 13m
Affected: Grafana Cloud: AWS Metrics
Timeline · 4 updates
  1. investigating May 01, 2026, 09:14 AM UTC

    We are investigating an issue with ingesting Metrics for AWS Cloud Provider Observability with Metric Streams. Users experiencing this issue may encounter ingestion errors in the "prod-us-central-7" region only. Engineering is actively engaged and assessing the issue. We will provide updates accordingly.

  2. investigating May 01, 2026, 09:42 AM UTC

    We are continuing to investigate this issue.

  3. monitoring May 01, 2026, 09:43 AM UTC

    A fix has been implemented and we are monitoring the results.

  4. resolved May 01, 2026, 10:27 AM UTC

    This incident has been resolved.
