Blacksmith Outage History

Blacksmith is up right now

Blacksmith had 51 outages in the last 2 years totaling 237h 44m of downtime, averaging 2.1 incidents per month.

The 51 outages recorded since August 6, 2025 are summarised below with incident details, duration, and resolution information.
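
As a rough cross-check of the summary figures, the sketch below recomputes the monthly rate from the stated outage count, assuming "the last 2 years" means a 24-month window. It also derives a mean outage length, which the page itself does not state. The variable names are illustrative.

```python
# Figures quoted in the summary: 51 outages, 237h 44m of downtime.
outage_count = 51
total_downtime_minutes = 237 * 60 + 44  # 237h 44m expressed in minutes

# Assumption: "the last 2 years" is treated as a 24-month window.
window_months = 24

incidents_per_month = outage_count / window_months
mean_minutes = total_downtime_minutes / outage_count  # derived, not stated on the page

print(f"{incidents_per_month:.1f} incidents per month")
print(f"mean outage length: {int(mean_minutes // 60)}h {int(mean_minutes % 60)}m")
```

The mean is skewed upward by a small number of long-running incidents, such as the one listed with a duration of 7d 2h.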

Source: https://status.blacksmith.sh

Major June 17, 2026

Job adoption delays

Detected by Pingoru
Jun 17, 2026, 12:45 PM UTC
Resolved
Jun 17, 2026, 01:48 PM UTC
Duration
1h 3m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS, Github → Actions, Github → Webhooks
Timeline · 6 updates
  1. investigating Jun 17, 2026, 12:45 PM UTC

    We are receiving reports of job adoption delays and are currently investigating.

  2. investigating Jun 17, 2026, 01:01 PM UTC

    We believe this is related to an issue with GitHub webhooks and are still investigating.

  3. identified Jun 17, 2026, 01:14 PM UTC

    We have identified the issue: GitHub is sending us a large number of malformed webhooks whose payloads are missing critical information. We are working on a patch to work around this while we wait for GitHub to fix the upstream issue.

  4. monitoring Jun 17, 2026, 01:27 PM UTC

    Our engineers are implementing a mitigation which should take 20 minutes to deploy. Thank you for your patience.

  5. monitoring Jun 17, 2026, 01:48 PM UTC

    We have implemented a fix and are seeing recovery while we continue to monitor the result. Jobs that are already queued won't recover on their own, so please cancel and rerun them.

  6. resolved Jun 17, 2026, 01:48 PM UTC

    This incident has been resolved.
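
For readers who need to cancel and rerun stuck jobs in bulk, the following is a minimal sketch that builds the two GitHub REST API requests for a single workflow run. The owner, repository, run ID, and token shown are placeholders, and the helper name `build_cancel_and_rerun` is hypothetical. The requests are only constructed here, not sent.

```python
import urllib.request

API_ROOT = "https://api.github.com"


def build_cancel_and_rerun(owner, repo, run_id, token):
    """Return (cancel, rerun) requests for one workflow run; nothing is sent."""
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    }
    base = f"{API_ROOT}/repos/{owner}/{repo}/actions/runs/{run_id}"
    cancel = urllib.request.Request(f"{base}/cancel", method="POST", headers=headers)
    rerun = urllib.request.Request(f"{base}/rerun", method="POST", headers=headers)
    return cancel, rerun


# Placeholder values for illustration only.
cancel_req, rerun_req = build_cancel_and_rerun(
    "example-org", "example-repo", 123456789, "TOKEN"
)
print(cancel_req.get_method(), cancel_req.full_url)
print(rerun_req.get_method(), rerun_req.full_url)
```

To send a request, pass it to `urllib.request.urlopen`. A run generally has to finish cancelling before a rerun is accepted, so a real script would wait between the two calls.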


Major June 15, 2026

GitHub is reporting degraded performance for Webhooks

Detected by Pingoru
Jun 15, 2026, 03:45 PM UTC
Resolved
Jun 15, 2026, 04:44 PM UTC
Duration
59m
Affected: Github → Webhooks
Timeline · 2 updates
  1. monitoring Jun 15, 2026, 03:45 PM UTC

    GitHub is reporting degraded performance with Webhooks. This may have an impact on job adoption. We are monitoring this incident.

  2. resolved Jun 15, 2026, 04:44 PM UTC

    GitHub has reported this incident as resolved.


Minor June 11, 2026

Currently investigating our degraded us-west storage cluster

Detected by Pingoru
Jun 11, 2026, 08:54 PM UTC
Resolved
Jun 11, 2026, 11:02 PM UTC
Duration
2h 8m
Affected: Incremental Docker Builders (us-west Storage Cluster), Docker Container Cache (us-west Storage Cluster), Stickydisks (us-west Storage Cluster)
Timeline · 2 updates
  1. investigating Jun 11, 2026, 08:54 PM UTC

    We are currently investigating this incident.

  2. resolved Jun 11, 2026, 11:02 PM UTC

    Speeds have returned to normal and this incident has been resolved.


Major June 11, 2026

Delays In Job Adoption

Detected by Pingoru
Jun 11, 2026, 04:35 PM UTC
Resolved
Jun 11, 2026, 07:38 PM UTC
Duration
3h 2m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS
Timeline · 4 updates
  1. investigating Jun 11, 2026, 04:35 PM UTC

    We're seeing a spike in jobs in some regions, leading to a temporary capacity shortage. We're looking into re-balancing to reduce the delays.

  2. identified Jun 11, 2026, 05:19 PM UTC

    We are continuing to work on a fix for this incident. eu-central and us-west are still experiencing some queueing; eu-west has recovered.

  3. monitoring Jun 11, 2026, 06:41 PM UTC

    us-west is still experiencing some mild queueing; eu-central and eu-west have recovered.

  4. resolved Jun 11, 2026, 07:38 PM UTC

    This incident has been resolved.


Major June 10, 2026

GitHub reporting API request degradation

Detected by Pingoru
Jun 10, 2026, 03:29 PM UTC
Resolved
Jun 10, 2026, 04:52 PM UTC
Duration
1h 23m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS, Github → API Requests
Timeline · 3 updates
  1. investigating Jun 10, 2026, 03:29 PM UTC

    We are currently investigating some reports of delays with job adoption. We are actively looking into this.

  2. monitoring Jun 10, 2026, 04:06 PM UTC

    We are currently in the process of re-queuing jobs.

  3. resolved Jun 10, 2026, 04:52 PM UTC

    GitHub has reported this incident as resolved.


Major June 9, 2026

Delays in job adoption

Detected by Pingoru
Jun 09, 2026, 07:35 PM UTC
Resolved
Jun 09, 2026, 10:19 PM UTC
Duration
2h 44m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS
Timeline · 5 updates
  1. investigating Jun 09, 2026, 07:35 PM UTC

    We are currently investigating this incident.

  2. identified Jun 09, 2026, 07:37 PM UTC

    The issue has been identified and fixed; however, there will still be some delays in job adoption as we catch up.

  3. monitoring Jun 09, 2026, 08:29 PM UTC

    The eu regions should now be caught up. There are still some delays in our us region.

  4. monitoring Jun 09, 2026, 09:50 PM UTC

    We are still experiencing some queuing in us-west and are implementing a patch to help increase adoptions.

  5. resolved Jun 09, 2026, 10:19 PM UTC

    This incident has been resolved.


Minor June 8, 2026

OAuth Issues Accessing Blacksmith Dashboard

Detected by Pingoru
Jun 08, 2026, 01:44 PM UTC
Resolved
Jun 08, 2026, 04:37 PM UTC
Duration
2h 53m
Affected: Website
Timeline · 3 updates
  1. investigating Jun 08, 2026, 01:44 PM UTC

    We are currently investigating an issue affecting some customers' ability to log into the dashboard.

  2. monitoring Jun 08, 2026, 03:12 PM UTC

    We now see users logging into the dashboard. The issue appears to be upstream and we are continuing to monitor.

  3. resolved Jun 08, 2026, 04:37 PM UTC

    This incident has been resolved.


Minor June 8, 2026

GitHub reporting degraded performance for Actions

Detected by Pingoru
Jun 08, 2026, 08:34 AM UTC
Resolved
Jun 08, 2026, 08:43 AM UTC
Duration
8m
Affected: Github → Actions
Timeline · 2 updates
  1. monitoring Jun 08, 2026, 08:34 AM UTC

    GitHub is reporting degraded performance with Actions as well as other services. This may have an impact on job adoption. We are monitoring this incident.

  2. resolved Jun 08, 2026, 08:43 AM UTC

    GitHub has resolved this incident.


Major May 27, 2026

Delay in picking up jobs

Detected by Pingoru
May 27, 2026, 07:01 PM UTC
Resolved
May 28, 2026, 12:18 AM UTC
Duration
5h 16m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS
Timeline · 5 updates
  1. investigating May 27, 2026, 07:01 PM UTC

    We are currently investigating this incident.

  2. identified May 27, 2026, 07:56 PM UTC

    We've deployed mitigations and are observing improvements.

  3. monitoring May 27, 2026, 08:24 PM UTC

    Continuing to monitor improvements from deployment.

  4. monitoring May 27, 2026, 10:35 PM UTC

    We're continuing to monitor the situation. There's still a slight delay in jobs, but cancelling and rerunning the jobs should resolve the issue.

  5. resolved May 28, 2026, 12:18 AM UTC

    This incident has been resolved.


Minor May 20, 2026

Delays in job adoption in eu-west

Detected by Pingoru
May 20, 2026, 04:33 PM UTC
Resolved
May 20, 2026, 07:13 PM UTC
Duration
2h 39m
Affected: eu-west x86
Timeline · 3 updates
  1. investigating May 20, 2026, 04:33 PM UTC

    We're seeing a spike of jobs in the eu-west region, leading to a temporary capacity shortage. We're looking into re-balancing to reduce the delays.

  2. identified May 20, 2026, 05:30 PM UTC

    We are continuing to monitor this and are tracking a related incident reported by GitHub Actions. We are continuing to re-balance in order to reduce delays.

  3. resolved May 20, 2026, 07:13 PM UTC

    This incident has been resolved.


Minor May 20, 2026

Transparent caching degraded in eu-west

Detected by Pingoru
May 20, 2026, 01:01 PM UTC
Resolved
May 27, 2026, 03:25 PM UTC
Duration
7d 2h
Affected: Actions Cache
Timeline · 4 updates
  1. monitoring May 20, 2026, 01:01 PM UTC

    We noticed a small uptick in transparent caching failures in our eu-west region, and have applied a potential fix.

  2. identified May 20, 2026, 02:41 PM UTC

    We have identified a bad host in the fleet and are working on replacing it. Client cache interactions are recovering as we roll out this fix.

  3. identified May 20, 2026, 10:59 PM UTC

    We're currently adding more capacity to the cluster. There will be a few minutes of slow cache interactions while the operation completes.

  4. monitoring May 21, 2026, 12:00 AM UTC

    The cluster expansion is complete. Our testing shows that cache interactions are no longer degraded. We're monitoring the cluster to confirm complete recovery.


Minor May 19, 2026

GitHub webhooks degraded causing job queueing

Detected by Pingoru
May 19, 2026, 02:50 PM UTC
Resolved
May 19, 2026, 11:39 PM UTC
Duration
8h 49m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS
Timeline · 6 updates
  1. investigating May 19, 2026, 02:50 PM UTC

    We're seeing evidence of degraded webhook delivery from GitHub. We're investigating.

  2. monitoring May 19, 2026, 03:53 PM UTC

    We're seeing upstream recovery for GitHub webhook deliveries. Jobs may queue as we process GitHub's backlog of webhook events.

  3. monitoring May 19, 2026, 05:10 PM UTC

    The system is still working through a large backlog of queued jobs caused by the incident, and we're exploring mitigations.

  4. monitoring May 19, 2026, 08:02 PM UTC

    We're still working through a substantial backlog of queued tasks that has accumulated over this period of delayed webhook arrivals.

  5. monitoring May 19, 2026, 09:22 PM UTC

    We are close to the end of the backlog of the queued tasks and are seeing full recovery in certain runner pools.

  6. resolved May 19, 2026, 11:39 PM UTC

    This incident has been resolved.


Minor May 15, 2026

GitHub Actions degraded

Detected by Pingoru
May 15, 2026, 08:20 AM UTC
Resolved
May 15, 2026, 08:29 AM UTC
Duration
9m
Affected: Github → Actions
Timeline · 2 updates
  1. identified May 15, 2026, 08:20 AM UTC

    GitHub is reporting a GitHub Actions degradation, and we're monitoring the impact.

  2. resolved May 15, 2026, 08:29 AM UTC

    GitHub has reportedly resolved the incident.


Minor May 14, 2026

Job Queueing due to Capacity Event

Detected by Pingoru
May 14, 2026, 09:20 PM UTC
Resolved
May 14, 2026, 10:30 PM UTC
Duration
1h 9m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS
Timeline · 2 updates
  1. investigating May 14, 2026, 09:20 PM UTC

    We are currently experiencing a high volume of jobs which is causing queuing in job adoption.

  2. resolved May 14, 2026, 10:30 PM UTC

    This incident has been resolved.


Major May 12, 2026

Webhook Service Outage

Detected by Pingoru
May 12, 2026, 08:39 PM UTC
Resolved
May 12, 2026, 11:46 PM UTC
Duration
3h 7m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS
Timeline · 3 updates
  1. investigating May 12, 2026, 08:39 PM UTC

    We are currently investigating this incident.

  2. monitoring May 12, 2026, 08:42 PM UTC

    We implemented a fix and are currently monitoring the result.

  3. resolved May 12, 2026, 11:46 PM UTC

    We have been running into load issues in our database layer that are causing sporadic instability in job adoption times. We have been actively working with our database provider to improve our systems to handle the recently increased load. We are planning proactive maintenance later this week to address this further.


Major May 11, 2026

Blacksmith control plane outage

Detected by Pingoru
May 11, 2026, 03:56 PM UTC
Resolved
May 11, 2026, 10:31 PM UTC
Duration
6h 35m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS
Timeline · 5 updates
  1. investigating May 11, 2026, 03:56 PM UTC

    There may be some delays with job adoption.

  2. monitoring May 11, 2026, 03:58 PM UTC

    We implemented a fix and are re-queuing jobs in the backlog.

  3. monitoring May 11, 2026, 07:32 PM UTC

    We are still investigating issues with slow job adoption.

  4. identified May 11, 2026, 08:30 PM UTC

    We are continuing to face issues with our upstream database provider. We are working with them to root cause the problem.

  5. resolved May 11, 2026, 10:31 PM UTC

    This incident has been resolved. Queue times are back to normal, and we are working on mitigations on our end to prevent such load-based degradations in the future.


Minor May 9, 2026

Elevated queue time for mac runners

Detected by Pingoru
May 09, 2026, 08:58 PM UTC
Resolved
May 09, 2026, 10:37 PM UTC
Duration
1h 39m
Affected: us-central MacOS
Timeline · 3 updates
  1. investigating May 09, 2026, 08:58 PM UTC

    We are currently investigating this incident.

  2. identified May 09, 2026, 09:38 PM UTC

    A fix for the issue has been deployed, and we expect queue times to start returning to normal momentarily.

  3. resolved May 09, 2026, 10:37 PM UTC

    This incident has been resolved; mac queue times are back to normal.


Major May 8, 2026

Elevated tail latency in git checkout operations

Detected by Pingoru
May 08, 2026, 06:22 PM UTC
Resolved
May 08, 2026, 08:30 PM UTC
Duration
2h 8m
Affected: us-west ARM, us-west x86
Timeline · 3 updates
  1. identified May 08, 2026, 06:22 PM UTC

    Our network provider has made us aware of an upstream degradation in our US West region. This is manifesting as a percentage of git checkout operations taking much longer than their baseline. We are working with our upstream provider to restore functionality.

  2. monitoring May 08, 2026, 07:54 PM UTC

    The upstream degradation has been resolved, and we are seeing checkout times return to normal.

  3. resolved May 08, 2026, 08:30 PM UTC

    This incident has been resolved; checkout operations are back to their baseline.


Minor May 4, 2026

GitHub Actions and Webhooks degradation

Detected by Pingoru
May 04, 2026, 04:04 PM UTC
Resolved
May 04, 2026, 06:13 PM UTC
Duration
2h 9m
Affected: Github → Actions, Github → Webhooks
Timeline · 3 updates
  1. monitoring May 04, 2026, 04:04 PM UTC

    GitHub is reporting a GitHub Actions and Webhooks degradation, and we're monitoring the impact.

  2. monitoring May 04, 2026, 04:36 PM UTC

    GitHub is still experiencing issues with Webhooks.

  3. resolved May 04, 2026, 06:13 PM UTC

    This incident has been resolved.


Major April 30, 2026

Degraded Actions performance in us-west

Detected by Pingoru
Apr 30, 2026, 01:17 PM UTC
Resolved
Apr 30, 2026, 05:36 PM UTC
Duration
4h 18m
Affected: us-west x86
Timeline · 3 updates
  1. monitoring Apr 30, 2026, 01:17 PM UTC

    We are aware of some reports of slow Actions performance in our us-west region. A mitigation has been applied and we're now seeing signs of recovery.

  2. monitoring Apr 30, 2026, 02:11 PM UTC

    We are continuing to see recovery in us-west.

  3. resolved Apr 30, 2026, 05:36 PM UTC

    This incident has been resolved.


Minor April 28, 2026

Blacksmith control plane DB maintenance

Detected by Pingoru
Apr 28, 2026, 03:45 PM UTC
Resolved
Apr 28, 2026, 05:50 PM UTC
Duration
2h 4m
Affected: Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS
Timeline · 4 updates
  1. investigating Apr 28, 2026, 03:45 PM UTC

    We are performing brief maintenance on our database. Customers may experience some delays with job adoption.

  2. monitoring Apr 28, 2026, 03:47 PM UTC

    We have completed the DB operation and are currently re-queuing jobs. This may take ~15 minutes.

  3. monitoring Apr 28, 2026, 04:44 PM UTC

    The re-queuing is still in progress; however, new jobs are being picked up without delay.

  4. resolved Apr 28, 2026, 05:50 PM UTC

    This incident has been resolved.


Major April 27, 2026

GitHub Actions Degraded

Detected by Pingoru
Apr 27, 2026, 04:38 PM UTC
Resolved
Apr 28, 2026, 12:01 AM UTC
Duration
7h 22m
Affected: Status | Blacksmith
Timeline · 2 updates
  1. investigating Apr 27, 2026, 04:38 PM UTC

    GitHub is reporting a GitHub Actions degradation, and we're monitoring the impact.

  2. resolved Apr 28, 2026, 12:01 AM UTC

    GitHub resolved their upstream outage and we are not seeing any customer impact.


Minor April 23, 2026

GitHub Actions and Packages degraded

Detected by Pingoru
Apr 23, 2026, 02:57 PM UTC
Resolved
Apr 23, 2026, 05:31 PM UTC
Duration
2h 34m
Affected: Status | Blacksmith, Blacksmith Managed Runners (eu-central ARM), eu-central x86, us-west ARM, us-west x86, eu-west x86, us-central MacOS
Timeline · 4 updates
  1. investigating Apr 23, 2026, 02:57 PM UTC

    We are currently monitoring the effects of a GitHub outage.

  2. resolved Apr 23, 2026, 03:24 PM UTC

    This incident has been resolved by GitHub.

  3. monitoring Apr 23, 2026, 03:28 PM UTC

    It seems that there are still trailing effects from the incident, and we're seeing elevated queue times due to slow webhook delivery.

  4. resolved Apr 23, 2026, 05:31 PM UTC

    This incident has been resolved by GitHub.


Minor April 22, 2026

Sticky Disk degradation in the US-WEST cluster

Detected by Pingoru
Apr 22, 2026, 05:50 PM UTC
Resolved
Apr 22, 2026, 08:14 PM UTC
Duration
2h 23m
Affected: Status | Blacksmith, Incremental Docker Builders, Docker Container Cache, Stickydisks
Timeline · 3 updates
  1. investigating Apr 22, 2026, 05:50 PM UTC

    We are currently investigating this incident.

  2. monitoring Apr 22, 2026, 06:41 PM UTC

    We needed to perform an emergency maintenance operation in our storage cluster. We are observing the recovery and seeing Docker build speeds improve.

  3. resolved Apr 22, 2026, 08:14 PM UTC

    This incident has been resolved.
