Is Trigger.dev down?
Last checked 8m ago. No incidents right now.
Trigger.dev is operational right now. Last checked 8m ago; the most recent incident resolved 10d ago.
Real-time Trigger.dev status, recent outages, and incident history — pulled directly from Trigger.dev's official status page at https://status.trigger.dev every 5 minutes. Pingoru tracks 6 Trigger.dev services and has captured 13 incidents in the last 90 days (97.96% uptime). Get email, Slack, Discord, or webhook alerts the moment Trigger.dev reports a new incident — free for 5 monitors, no credit card.
Recent outages & incidents
Past 90 days
Timeline · 1 update
- resolved · Jun 03, 2026, 07:35 PM UTC
Deployments recovered
Timeline · 1 update
- resolved · May 25, 2026, 05:25 PM UTC
Deployments recovered
- Trigger.dev cloud, Trigger.dev OpenTelemetry
Timeline · 3 updates
- investigating · Apr 21, 2026, 03:40 PM UTC
We are seeing some DNS resolution issues in us-east-1. OTel spans in the web dashboard are also affected. We are investigating the root cause. In the meantime, please consider switching to the eu-central-1 region while we investigate.
- resolved · Apr 23, 2026, 05:15 PM UTC
Postmortem: Intermittent DNS Resolution Failures (US region)

Date: 2026-04-21
Duration: ~1 hour
Impact: Intermittent DNS resolution failures for tasks running in our US region

Summary

Between approximately 15:50 and 16:50 UTC on 2026-04-21, some tasks running in our US region experienced intermittent DNS resolution failures when connecting to external services such as databases, object storage, and third-party APIs. The errors typically surfaced as EAI_AGAIN or transient name-resolution timeouts. Our EU region was unaffected. The platform itself remained available throughout, and the incident self-resolved before any manual intervention.

What customers experienced

Tasks in the US region that made outbound connections to external hostnames during the affected window could see intermittent resolution failures. Cluster-internal communication and task scheduling were not impacted. We did not observe degradation of the dashboard, API, or engine.

Root cause

Under elevated workload, an internal telemetry service hit its memory limit and restarted. When it came back up, a large number of running tasks simultaneously re-established their connections to it, triggering a burst of DNS lookups for a single internal hostname. A default DNS configuration on our pods amplified that burst by roughly 10x at the wire level, because each hostname resolution attempted several search-path variants, and each of those was duplicated across the IPv4 and IPv6 query families. The telemetry service then went into a short restart loop, so the reconnection surge repeated, keeping DNS resolution degraded across the window. Some external name-resolution attempts from customer task code ran long enough to hit their own timeouts and surfaced as EAI_AGAIN.

During the incident, our cluster DNS resolver remained healthy by its own metrics. The failure was silent at the node-network layer and was only fully reconstructable from after-the-fact packet capture. This is the main reason we could not attribute the degradation in real time, and it is the primary gap we have closed in the days since.

What we've done

  - Further scaled the telemetry service horizontally, so there's more capacity headroom under workload spikes, and a simultaneous all-replicas restart is less likely.
  - Deployed a per-node DNS cache across our worker fleet to absorb redundant lookup patterns at source, before they reach the cluster DNS resolver.
  - Reduced DNS amplification at source on platform-managed services, so a single hostname lookup produces one wire query instead of ten.
  - Extended our monitoring for network-level silent drops, per-upstream DNS resolver latency, and telemetry pipeline health.
  - Shipped a suite of production alerts calibrated against this incident's actual signature. A future incident of this class should be detected in minutes rather than reconstructed after the fact.

Ongoing

  - A separate observation about how our cluster DNS resolver forwards queries under load is under continued investigation. It is not currently known to cause user-facing impact, but it is a gap in our understanding that we are closing.
  - We are re-evaluating DNS defaults on customer-facing tasks to remove the amplification pattern entirely, rather than just absorbing it with caching.

We take incidents of this class seriously even when they self-resolve. The multi-day investigation was a deliberate choice to reconstruct the event with enough fidelity to fix the underlying observability gap, not just the immediate symptoms. We apologise for the disruption.
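The roughly 10x amplification described in the root cause can be reproduced with simple arithmetic. The sketch below is illustrative only: the postmortem does not publish the actual resolver settings, so the four search domains, the `ndots` value of 5, and the hostname `telemetry.example.internal` are all assumptions, loosely modelled on a common Kubernetes-style pod `resolv.conf`. The function `wire_queries` is a hypothetical helper that lists the worst-case queries a stub resolver may send when every candidate name is tried.

```python
from itertools import product


def wire_queries(
    hostname: str,
    search_domains: list[str],
    ndots: int = 5,
    families: tuple[str, ...] = ("A", "AAAA"),
) -> list[tuple[str, str]]:
    """Worst-case (name, record type) queries for a single hostname lookup."""
    if hostname.endswith("."):
        # A trailing dot marks the name as fully qualified: no search list.
        candidates = [hostname]
    else:
        expanded = [f"{hostname}.{domain}." for domain in search_domains]
        absolute = [f"{hostname}."]
        # Names with fewer than `ndots` dots try the search list first.
        if hostname.count(".") < ndots:
            candidates = expanded + absolute
        else:
            candidates = absolute + expanded
    # Each candidate name is queried once per address family.
    return list(product(candidates, families))


# Assumed search list (hypothetical values, not taken from the postmortem).
ASSUMED_SEARCH = [
    "default.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "example.internal",
]

worst_case = wire_queries("telemetry.example.internal", ASSUMED_SEARCH)
print(len(worst_case))  # 5 candidate names x 2 families = 10 under these assumptions

single = wire_queries("telemetry.example.internal.", ASSUMED_SEARCH, families=("A",))
print(len(single))  # fully qualified name, one family = 1
```

Under these assumptions, a fully qualified name queried for a single address family is one way to arrive at the "one wire query instead of ten" figure mentioned in the remediation list; the postmortem does not say which mechanism was actually used.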
Timeline · 1 update
- resolved · Apr 16, 2026, 04:57 PM UTC
Deployments recovered
- Trigger.dev cloud
Timeline · 2 updates
- investigating · Apr 15, 2026, 02:03 PM UTC
DNS in us-east-1 is degraded. This can impact HTTP requests, including database queries. Up to 5% of database queries appear to be affected for some users.
- resolved · Apr 15, 2026, 02:40 PM UTC
The issue is now resolved and DNS service is fully recovered. The degradation was caused by abnormal DNS load patterns, which caused intermittent issues with external resolvers. We've already applied some changes to harden our networking infrastructure and will monitor closely to confirm stability.
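Task code that makes outbound connections can tolerate short windows of intermittent DNS failure like this one by retrying only the transient resolution error. The following is a minimal sketch, not a Trigger.dev API: `retry_on_transient_dns` is a hypothetical helper, the attempt count and delays are arbitrary assumptions, and it relies on Python reporting a temporary resolution failure as `socket.gaierror` with `errno` equal to `socket.EAI_AGAIN`.

```python
import random
import socket
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_transient_dns(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying only when name resolution fails with EAI_AGAIN."""
    for attempt in range(attempts):
        try:
            return operation()
        except socket.gaierror as err:
            is_last_attempt = attempt == attempts - 1
            # Permanent errors (for example EAI_NONAME) are re-raised at once.
            if err.errno != socket.EAI_AGAIN or is_last_attempt:
                raise
            # Exponential backoff with jitter, so many tasks do not retry in lockstep.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
    raise ValueError("attempts must be at least 1")
```

A call might look like `retry_on_transient_dns(lambda: socket.getaddrinfo("db.example.com", 5432))`, where the hostname and port are placeholders. The jitter is a deliberate choice: synchronized reconnection attempts can themselves add to DNS load.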
See the full Trigger.dev outage history
6 more incidents in the last 90 days, plus the full multi-year archive of per-service events and update timelines.
- Deployments recovered · Resolved · Started Jun 03, 2026, 07:35 PM UTC · Resolved Jun 03, 2026, 07:35 PM UTC
- Deployments recovered · Resolved · Started May 25, 2026, 05:25 PM UTC · Resolved May 25, 2026, 05:25 PM UTC
- DNS in us-east-1 is degraded · Resolved · Started Apr 21, 2026, 03:40 PM UTC · Resolved Apr 23, 2026, 05:15 PM UTC · Duration 2d 1h
- Deployments recovered · Resolved · Started Apr 16, 2026, 04:57 PM UTC · Resolved Apr 16, 2026, 04:57 PM UTC
- DNS in us-east-1 is degraded · Resolved · Started Apr 15, 2026, 02:03 PM UTC · Resolved Apr 15, 2026, 02:40 PM UTC · Duration 37m
- Realtime recovered · Resolved · Started Apr 01, 2026, 07:09 PM UTC · Resolved Apr 01, 2026, 07:09 PM UTC
- Realtime recovered · Resolved · Started Apr 01, 2026, 05:21 PM UTC · Resolved Apr 01, 2026, 05:21 PM UTC
- Realtime recovered · Resolved · Started Apr 01, 2026, 04:38 PM UTC · Resolved Apr 01, 2026, 04:38 PM UTC