Coveralls Outage History
Coveralls is up right now. Coveralls had 18 outages in the last 2 years totaling 943h 48m of downtime, averaging 0.7 incidents per month.
There were 18 Coveralls outages since June 2, 2025 totaling 943h 48m of downtime. Each is summarised below — incident details, duration, and resolution information.
Unscheduled maintenance
Timeline · 2 updates
- investigating May 10, 2026, 02:44 PM UTC
We are currently investigating an issue with an apparently expired SSL cert. Our orchestration layer shows no issue with certs so we need to dig deeper.
- resolved May 10, 2026, 03:11 PM UTC
This incident has been resolved.
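The time between two timeline updates can be worked out directly from the timestamps. The sketch below is illustrative only: the helper names `parse_stamp` and `elapsed` are hypothetical, and the gap between the first and last update is not necessarily the length of the outage itself.

```python
from datetime import datetime, timedelta, timezone

# Layout of the timeline timestamps, e.g. "May 10, 2026, 02:44 PM UTC".
STAMP_FORMAT = "%b %d, %Y, %I:%M %p UTC"


def parse_stamp(text: str) -> datetime:
    """Parse a status-page timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(text, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def elapsed(first: str, last: str) -> timedelta:
    """Return the time between two timeline updates."""
    return parse_stamp(last) - parse_stamp(first)


if __name__ == "__main__":
    # The "investigating" and "resolved" updates of the incident above.
    print(elapsed("May 10, 2026, 02:44 PM UTC", "May 10, 2026, 03:11 PM UTC"))
```

For the incident above, the two updates are 27 minutes apart.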
Service outage (RESTORED, MONITORING)
Timeline · 16 updates
All Systems Operational
Timeline · 1 update
- resolved Feb 02, 2026, 05:33 PM UTC
Just a note to address the gap in our Coverage Calculation Background Job Dequeue Time Graph today, MON, FEB 2 from 00:15:00 PST to 07:35:00 PST: Coveralls experienced no disruption in service at this time. Instead, a deployment issue caused the cron job that reports the metric to fail until it was resolved at 07:35:00 PST.
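Because the note gives its window in PST while the timeline is stamped in UTC, converting the window may make the two easier to compare. This is a minimal sketch that assumes a fixed UTC-8 offset for PST; `pst_to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Pacific Standard Time (UTC-8), the zone named in the note.
PST = timezone(timedelta(hours=-8), "PST")


def pst_to_utc(year: int, month: int, day: int, hour: int, minute: int) -> datetime:
    """Interpret a wall-clock time as PST and convert it to UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=PST)
    return local.astimezone(timezone.utc)


# The gap in the graph described above: 00:15 to 07:35 PST on Feb 2, 2026.
gap_start = pst_to_utc(2026, 2, 2, 0, 15)
gap_end = pst_to_utc(2026, 2, 2, 7, 35)
gap_length = gap_end - gap_start
```

Under that assumption the gap runs from 08:15 to 15:35 UTC, or 7 hours 20 minutes, and ends about two hours before the note was posted.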
Elevated Latency in APAC and EU
Timeline · 3 updates
- identified Dec 19, 2025, 02:34 PM UTC
We are continuing to experience elevated latency in APAC and EU.
- monitoring Dec 19, 2025, 02:52 PM UTC
We are continuing to monitor as we scale to clear backlogged jobs.
- resolved Dec 19, 2025, 03:24 PM UTC
We had to pause some queues to recover performance for new jobs, but will clear those as soon as we've recovered normal build times for new jobs (ETA: 15 min). If you are a user in APAC or EU, you may have had your jobs paused. One repo in particular has accounted for 90% of the overnight workload. We will reach out to that user.
Elevated Latency
Timeline · 3 updates
- monitoring Nov 17, 2025, 04:35 PM UTC
We received reports from EU customers of elevated latency and have resolved the cause. We will be managing workloads to ensure standard processing times for all new builds, and scaling resources to clear delayed builds. Follow this incident for updates.
- monitoring Nov 17, 2025, 05:28 PM UTC
Build times are returning to normal for all builds created since 05:00AM PDT. Older builds are being processed by scaled resources and should clear in 45min to 1 hr. If you are having issues with a slow or stuck build, especially if priority, feel free to reach out to us at [email protected]. These steps will save time: 1) Mention this incident: https://status.coveralls.io/incidents/hkqt790213m5 2) Share your Coveralls Build URL (from your CI build log), or your CI build number.
- resolved Nov 17, 2025, 07:43 PM UTC
This incident has been resolved. Build times for all new builds are normal across the board. We are still clearing a backlog of background jobs, which should be clear in the next 15-20 min. If you are having issues with a slow or stuck build, feel free to reach out to us at [email protected]. These steps will save time: 1) Mention this incident: https://status.coveralls.io/incidents/hkqt790213m5 2) Share your Coveralls Build URL (from your CI build log), or your CI build number.
504 Gateway Timeouts (Resolved)
Timeline · 2 updates
- monitoring Oct 24, 2025, 06:35 PM UTC
We received reports of 504 Gateway Timeouts from our Web Servers (coverage report uploads) between 6:00am-6:20am PDT and several moments ago between 11:15am-11:30am PDT. We have recovered service to all web servers and are monitoring for further occurrences.
- resolved Oct 24, 2025, 09:54 PM UTC
We are closing this incident having received no further reports of 504 errors today. We will continue to monitor for them.
Some reports of 504 Timeouts
Timeline · 2 updates
- monitoring Oct 13, 2025, 06:02 PM UTC
We have received some reports of 504 errors this morning. We are monitoring to avoid further occurrences. We are looking into any gaps in our new monitoring regime, which has drastically reduced occurrences, to see where we can further improve.
- resolved Oct 13, 2025, 11:15 PM UTC
We have received no further reports of 504 timeout errors on coverage report uploads today, but we continue to monitor and will continue trying to improve our mitigations.
504 Timeouts (Resolved)
Timeline · 3 updates
- monitoring Oct 10, 2025, 03:23 PM UTC
We've received reports of 504 Timeout errors from 7am PDT through 8:15am PDT. We have addressed the issue, and it was resolved as of 8:19am PDT. We are monitoring for further issues.
- monitoring Oct 10, 2025, 04:46 PM UTC
Based on the details of today's incident, we've added another layer of monitoring that should give us earlier indications of this error state and allow us to respond more quickly.
- resolved Oct 10, 2025, 07:00 PM UTC
This incident is resolved. We have applied an additional layer of monitoring that should help us catch these cases earlier.
504 Timeout Errors on Coverage Uploads
Timeline · 2 updates
- monitoring Sep 26, 2025, 03:28 PM UTC
All systems operational. During our US overnight hours (PDT / GMT-7), we received additional reports of 504 timeout errors affecting coverage uploads. The issue has since been resolved, and we are actively monitoring for further occurrences. We have already implemented several mitigations over the past few weeks, and additional measures will be deployed today to further reduce the likelihood of these errors, particularly for our international customers.
If you experience a 504 error, the most helpful details you can provide are:
1. A timestamp or timeframe of the error
2. Any Cloudflare Ray ID shown on the error page
Please share these details in our public tracking issue here: https://github.com/lemurheavy/coveralls-public/issues/1824
These reports are invaluable as we continue to investigate and refine our mitigations.
- resolved Sep 29, 2025, 03:56 PM UTC
We are closing this incident after recent mitigations and a weekend without any reports. We continue to implement mitigations and infrastructure changes we believe will further reduce incidents of this error type.
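For anyone scripting uploads, the two details requested in this incident (a timestamp and the Cloudflare Ray ID) could be captured automatically when a 504 comes back. The sketch below is an assumption-laden illustration, not part of any Coveralls tooling: `describe_504` is a hypothetical helper, and it assumes the Ray ID is also available in the `CF-RAY` response header that Cloudflare normally attaches, rather than only on the error page.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional


def describe_504(status: int, headers: Mapping[str, str],
                 seen_at: Optional[datetime] = None) -> Optional[str]:
    """Build a one-line report for a 504 response, or None for other statuses."""
    if status != 504:
        return None
    seen_at = seen_at or datetime.now(timezone.utc)
    # Header names are case-insensitive, so compare them in lowercase.
    # Assumption: the Ray ID arrives in a "CF-RAY" response header.
    ray_id = next(
        (value for name, value in headers.items() if name.lower() == "cf-ray"),
        "not shown",
    )
    stamp = seen_at.isoformat(timespec="seconds")
    return f"504 at {stamp} (Cloudflare Ray ID: {ray_id})"
```

The status code and headers would come from whatever HTTP client the upload script already uses; the returned line can then be pasted into the tracking issue.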
Elevated 504 Timeout Errors
Timeline · 10 updates
More reports of “Website under heavy load”
Timeline · 6 updates
Reports of "Website under heavy load" errors
Timeline · 4 updates
Service unavailable with HTML error page or 500 errors
Timeline · 3 updates
Intermittent request rejections
Timeline · 3 updates
- identified Aug 13, 2025, 05:14 PM UTC
We've received reports that some users are receiving Web or API responses of: "This website is under heavy load (queue full)." We have identified the source of the issue and are working to resolve it now.
- resolved Aug 13, 2025, 07:24 PM UTC
Closing this incident as a potential false alarm. Connection stats show no issues and we cannot reproduce any responses of, "This website is under heavy load (queue full)." Nor do we see any in logs from this morning. We now believe the report we received this morning may have actually been encountered yesterday, around the same time as another report, before the incident was resolved. (Confirmed.) Please let us know if you have received the Web or API response, "This website is under heavy load (queue full)," and when you encountered it—which would be especially helpful.
- postmortem Aug 19, 2025, 06:54 PM UTC
A postmortem for this incident and its related incidents has been posted [here](https://status.coveralls.io/incidents/wqbsxnzv0jsf):
* [**Postmortem: Reports of “Website under heavy load” errors**](https://status.coveralls.io/incidents/wqbsxnzv0jsf)
**Related incidents**
1. **Aug 13**: [Intermittent request rejections](https://status.coveralls.io/incidents/1n7plxrj8j44)
2. **Aug 14**: [Service unavailable with HTML error page or 500 errors](https://status.coveralls.io/incidents/v5mcbrsbhgt4)
3. **Aug 18**: [Reports of "Website under heavy load" errors](https://status.coveralls.io/incidents/fr6sp5kyn128)
4. **Aug 19 (Today)**: [Reports of "Website under heavy load" errors](https://status.coveralls.io/incidents/wqbsxnzv0jsf)
Elevated Latency — Resolved
Timeline · 1 update
- resolved Aug 08, 2025, 07:34 PM UTC
We experienced a period of elevated latency due to overnight traffic for about 6 hours today, Fri, Aug 8, from about 1am-7am PDT. We paused a couple of outlier repos, scaled servers to clear a backlog, and resolved ongoing latency. It took approximately 30 min to clear our remaining backlog of background jobs.
We'd also like to take this opportunity to address a question we get, which is:
Q: "If the Dequeue Time Graph is missing from the Status Page, does that mean processing has stopped?"
A: No. While a missing graph can indicate a problem, it’s often isolated to the sampling process itself: either the cron job that collects the data or the single server we currently use for sampling. This “flagship” server runs additional packages and handles control and reporting tasks beyond normal app serving. At times, those extra duties can overload it, causing sampling failures without affecting overall build processing. We plan to broaden our sampling to multiple servers in the future.
Elevated Latency for some users
Timeline · 3 updates
- identified Jun 17, 2025, 01:34 PM UTC
The issue has been identified and a fix is being implemented.
- monitoring Jun 17, 2025, 06:10 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Jun 18, 2025, 03:37 PM UTC
This incident has been resolved.
Delayed Coverage Calculations for Some Users
Timeline · 3 updates
- monitoring Jun 03, 2025, 03:33 PM UTC
We are continuing to clear a backlog of background processing jobs for builds submitted in the past 18-24 hrs. While all systems are operational, there will continue to be latency on build times until we clear all background job queues, which are FIFO. Current estimate: 1-hour. We will post updates here until build times return to normal.
- monitoring Jun 03, 2025, 05:34 PM UTC
We took the following actions to more quickly restore normal build times for all _new_ builds today:
  - We moved all unfinished background jobs from yesterday (Jun 2) into holding queues in order to restore normal build times for new builds from today (Jun 3).
  - We scaled resources to more quickly drain the existing queues of jobs from new builds from today (Jun 3).
We will monitor progress on all new builds and provide updates here until we're fully caught up (zero (0) background jobs in queue). Thanks for your patience in the meantime as we restore the best possible performance to the service.
- resolved Jun 04, 2025, 04:11 AM UTC
All builds from today (Jun 3) have been processed. As background job queues cleared, build times returned to normal. We will continue monitoring.
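The holding-queue step described in this incident can be pictured with a small sketch. Everything here is hypothetical (the `Job` type, its field names, and `split_backlog`); it only illustrates how setting older jobs aside lets newer jobs reach the front of a FIFO queue, and is not Coveralls' actual implementation.

```python
from collections import deque
from dataclasses import dataclass
from datetime import date
from typing import Deque, Iterable, Tuple


@dataclass(frozen=True)
class Job:
    """Hypothetical background job tagged with the day its build was submitted."""
    build_id: str
    submitted: date


def split_backlog(queue: Iterable[Job], cutoff: date) -> Tuple[Deque[Job], Deque[Job]]:
    """Split a FIFO queue into (active, holding) queues.

    Jobs submitted before `cutoff` go to the holding queue; the rest stay
    active. Relative order is preserved within each queue.
    """
    active: Deque[Job] = deque()
    holding: Deque[Job] = deque()
    for job in queue:
        (holding if job.submitted < cutoff else active).append(job)
    return active, holding
```

With a cutoff of Jun 3, jobs from Jun 2 would land in the holding queue, so a worker taking jobs from the left of the active queue reaches Jun 3 builds immediately.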