Courier Outage History
Courier is up right now.
Courier had 31 outages in the last 2 years (since May 28, 2024), totaling 62h 11m of downtime and averaging 1.3 incidents per month. Each is summarised below with incident details, duration, and resolution information.
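The summary figures can be checked with a little arithmetic. The sketch below assumes the "last 2 years" window is exactly 24 months; the variable names are illustrative and are not part of any Courier tooling.

```python
from datetime import timedelta

# Figures taken from the summary above.
outages = 31
total_downtime = timedelta(hours=62, minutes=11)

# Assumption: the "last 2 years" window is treated as exactly 24 months.
months = 24

# Average number of incidents per month over the window.
per_month = outages / months

# Mean downtime per incident, expressed in minutes.
mean_minutes = total_downtime.total_seconds() / 60 / outages

print(f"{per_month:.1f} incidents per month")
print(f"about {round(mean_minutes)} minutes of downtime per incident")
```

Under that assumption the monthly rate rounds to the 1.3 quoted above, and the mean downtime works out to roughly two hours per incident.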
Observability Degradation
Timeline · 1 update
- resolved Jan 29, 2026, 11:13 PM UTC
The Courier team identified an issue affecting observability metrics between 21:38 UTC and 22:13 UTC, during which metrics were briefly unavailable. A fix was released and metrics have stabilized across observability channels. Metrics received during the outage window will not be reflected in observability dashboards.
Courier Web App Performance
Timeline · 2 updates
- identified Jan 14, 2026, 09:47 PM UTC
The Courier team was alerted to a small degradation in the Courier web application. We have identified the issues and rolled out a fix to mitigate them. The following features will experience some latency: Audience Calculation and Audit Trail logging. We will follow up with updates.
- resolved Jan 15, 2026, 12:56 AM UTC
Fix is out and application is stable
All services impacted
Timeline · 4 updates
- identified Oct 20, 2025, 09:54 AM UTC
An issue with our service provider, AWS, has impacted all Courier services. The engineering team is working on resolution.
- identified Oct 20, 2025, 09:55 AM UTC
We are continuing to work on a fix for this issue.
- monitoring Oct 20, 2025, 10:02 AM UTC
Service has been restored. Engineering is working through system review and will continue to monitor.
- resolved Oct 20, 2025, 10:11 AM UTC
The incident has been resolved
Microsoft Teams Tenant Id Errors
Timeline · 2 updates
- monitoring Oct 17, 2025, 05:33 AM UTC
The Courier team was alerted to an issue causing Microsoft Teams requests to fail when sending. We have identified the issue and published a fix that will land in production shortly.
- resolved Oct 17, 2025, 06:01 AM UTC
Release is in production and Microsoft Teams messages are passing successfully.
Mailgun Message Delivery Latency
Timeline · 1 update
- resolved Oct 13, 2025, 04:19 PM UTC
Courier experienced a slowdown in message delivery with Mailgun providers around 2 AM PST. Our monitoring systems caught the spike at the bottleneck, which eventually stabilized. Our infrastructure team is actively monitoring.
Message Delivery Latency
Timeline · 1 update
- resolved Oct 08, 2025, 03:16 PM UTC
Courier's delivery platform experienced significant latency around 1 AM PST. Our platform team resolved the issue and is actively monitoring for any further issues.
Webapp HTTP Interaction Issues
Timeline · 2 updates
- identified Jul 07, 2025, 09:10 PM UTC
An issue with logging in to the Courier platform and overall app performance was identified by our team and a revert of the offending release has been deployed. The estimated time for the release to land is 45 minutes.
- resolved Jul 07, 2025, 09:28 PM UTC
Revert has landed and webapp is operational.
Cloudflare Outage Possible Okta SSO Impact
Timeline · 2 updates
- monitoring Jun 12, 2025, 07:30 PM UTC
Cloudflare has reported an outage that will impact SSO features provided by WorkOS. Customers with Okta SSO might experience some degradation when signing in. Our team is monitoring the status for both Cloudflare and WorkOS. WorkOS status: https://status.workos.com/incidents/k9s870cktcsf Cloudflare Status: https://www.cloudflarestatus.com/incidents/25r9t0vz99rp
- resolved Jun 12, 2025, 08:36 PM UTC
WorkOS systems are back to normal.
Message Delays on Send Pipeline
Timeline · 5 updates
- investigating Jun 05, 2025, 06:51 PM UTC
The Courier team has identified an issue where there are significant delays of up to 10 minutes for messages. We are currently investigating and looking for a root cause to mitigate. Updates will follow.
- identified Jun 05, 2025, 07:04 PM UTC
Backed-up messages were caused by a bottleneck in our system, which has started to catch up. Messages will stabilize as the bottleneck clears.
- monitoring Jun 05, 2025, 07:14 PM UTC
Message bottleneck has cleared and the queue has stabilized. We will continue to monitor.
- monitoring Jun 05, 2025, 07:19 PM UTC
Messages have stabilized but we will continue to monitor.
- resolved Jun 05, 2025, 08:08 PM UTC
Messages have recovered and pipeline is stable again.
EU Region – Notifications Stuck in “Queued” State
Timeline · 3 updates
- investigating May 16, 2025, 02:28 PM UTC
We are currently investigating an issue where EU workspace notifications appear to remain in the “queued” state.
- identified May 16, 2025, 02:53 PM UTC
The issue has been identified and a fix is being implemented.
- resolved May 16, 2025, 09:02 PM UTC
The issue has been resolved. The root cause was purely a logging issue: notifications were still sent but showed up in logs as queued. All logs have been rehydrated.
AWS SES Failures
Timeline · 5 updates
- identified Apr 30, 2025, 10:39 PM UTC
The Courier team has identified an issue with a recent deployment that is impacting AWS SES integrations. We have found the root cause and have reverted the changes. The team is assessing the impact.
- identified Apr 30, 2025, 10:47 PM UTC
The reverted changes will land in production in around one hour. The team will investigate the impact and handle message retries accordingly.
- monitoring Apr 30, 2025, 11:26 PM UTC
Fix has landed and the team is monitoring progress. Messages sent to AWS SES are sending normally. The team will reprocess impacted messages.
- monitoring May 01, 2025, 01:06 AM UTC
The team is still working through the reprocessing for impacted messages. Confirmation of all impacted messages reprocessed will be updated here.
- resolved May 01, 2025, 04:58 AM UTC
All impacted messages have been reprocessed.
Internal Courier Error Issue
Timeline · 7 updates
- identified Feb 28, 2025, 04:48 PM UTC
The Courier team has identified an issue where messages fail due to an "Internal Courier Error" in the test environment. Our team found the root cause and is working to mitigate the issue.
- identified Feb 28, 2025, 05:14 PM UTC
We have tested the fix for "Internal Courier Error" and will be releasing it to production soon.
- identified Feb 28, 2025, 05:36 PM UTC
A fix has been merged and will fully deploy within the next hour. Updates to follow.
- monitoring Feb 28, 2025, 06:40 PM UTC
The fix has landed in production and we will be monitoring for any more "Internal Courier Errors". We will be working on reprocessing any dropped messages caused by this issue.
- monitoring Feb 28, 2025, 06:57 PM UTC
Messages with tags are no longer impacted, and error has been resolved. The team is working on reprocessing dropped messages.
- monitoring Feb 28, 2025, 07:52 PM UTC
The team is still working on reprocessing dropped messages caused by this issue. Send pipeline is operational.
- resolved Mar 01, 2025, 04:11 AM UTC
All impacted messages have been reprocessed.
Segment Event Processing Slowdowns
Timeline · 2 updates
- monitoring Feb 21, 2025, 03:55 PM UTC
The Courier team has identified a slowdown in Segment event processing. There will be a slight delay for messages triggered by incoming Segment triggers. We estimate the queue to normalize soon and are monitoring.
- resolved Feb 21, 2025, 05:10 PM UTC
Segment event processing stream has resolved.
Significant Notification Event Latency
Timeline · 3 updates
- identified Feb 18, 2025, 02:22 PM UTC
The Courier team identified a delay in message processing and event updating. The pipeline queue for message delivery has normalized; however, message events are still processing with significant slowdowns.
- monitoring Feb 18, 2025, 02:29 PM UTC
We're monitoring our processing pipeline as messages are flowing normally. There is a slight delay in event processing which will also impact outbound webhooks.
- resolved Feb 18, 2025, 02:47 PM UTC
Message sends are flowing normally, and events are caught up without delays.
Courier Send and Message Event Slowdown
Timeline · 3 updates
- identified Feb 13, 2025, 10:32 PM UTC
The Courier team has identified a slowdown in sending notifications and is closely monitoring it. We do not expect any messages to drop. We will continue to monitor and update accordingly.
- monitoring Feb 13, 2025, 10:51 PM UTC
Our team has identified a bottleneck in our system caused by timeouts on our sendworker. Our exponential backoff has caught up and messages are starting to go through.
- resolved Feb 13, 2025, 11:09 PM UTC
The team has traced the issue to network connections on AWS, which resulted in a backup of messages in our pipeline. Messages are flowing through as expected, and the stream has caught up.
Delay in Message Sends and Processing
Timeline · 3 updates
- identified Feb 11, 2025, 10:15 PM UTC
The Courier team has identified a slowdown in our pipeline. The queue has backed up but no messages have been dropped at the moment. We will continue to update as we monitor our pipeline.
- monitoring Feb 11, 2025, 10:34 PM UTC
The team has identified the cause of the message send slowdowns to be related to an AWS service. Messages that were stuck in a queued state are slowly passing through. The team will continue to monitor and update.
- resolved Feb 11, 2025, 11:08 PM UTC
The send stream has caught up, all messages have been sent and the issue appears to be fully resolved.
Automation Delay Processing Issues
Timeline · 3 updates
- identified Jan 11, 2025, 01:23 AM UTC
We are currently experiencing an issue affecting automation workflows that include delay steps. Some customers may encounter failures when attempting to execute automations with scheduled delays. Our engineering team has identified the root cause and is implementing a fix. We expect the service to be fully restored in a short while.
- monitoring Jan 11, 2025, 02:19 AM UTC
We have deployed a fix for the earlier issue affecting automation workflows with delay steps. New automation workflows are now executing successfully. However, automations that failed during the incident period (approximately 3:20 PM PT - 6:05 PM PT) have not yet been automatically retried. Our engineering team is monitoring the recovery and evaluating a plan to process these backlogged automations.
- resolved Jan 11, 2025, 04:25 AM UTC
The service issue affecting automation workflows with delay steps has been partially resolved. New automation workflows started after 6:05 PM PT are executing successfully. However, we have determined that automations that failed during the incident period (approximately 3:20 PM PT - 6:05 PM PT) cannot be automatically retried at this time due to technical limitations. If your business was impacted by failed automation runs during this incident, please contact our support team and we will work directly with you to address your concerns. We sincerely apologize for this disruption to your workflows.
Delayed Message Processing
Timeline · 8 updates
- investigating Oct 24, 2024, 10:38 PM UTC
Courier has identified a delay in message processing for the messages API. We are currently investigating the root cause and will update you periodically.
- investigating Oct 24, 2024, 11:01 PM UTC
We are continuing to investigate the root cause of the delayed message processing.
- investigating Oct 24, 2024, 11:29 PM UTC
The team is still investigating and discussing the root cause internally. Messages are experiencing longer than normal processing times.
- identified Oct 25, 2024, 12:00 AM UTC
The team has identified the issue and is rolling out a fix for the message delays.
- identified Oct 25, 2024, 12:40 AM UTC
Fix has been deployed and will be live in around 1hr.
- identified Oct 25, 2024, 01:31 AM UTC
The team identified an issue with the latest fix and reverted it. A new fix has been published to mitigate the message delay issue. ETA to land in production ~45 minutes.
- monitoring Oct 25, 2024, 02:43 AM UTC
A fix has been implemented and delivery times are beginning to return to expected levels. The team will continue to monitor.
- resolved Oct 25, 2024, 02:48 AM UTC
This incident has been resolved
Message Logs Delays
Timeline · 6 updates
- investigating Oct 21, 2024, 03:56 PM UTC
The Courier team is actively investigating a bottleneck in the event logger for message event logs. Messages are still sending.
- investigating Oct 21, 2024, 03:56 PM UTC
We are continuing to investigate this issue.
- identified Oct 21, 2024, 04:07 PM UTC
The team is testing out a fix to reduce the bottlenecked log lines before releasing to production.
- identified Oct 21, 2024, 05:19 PM UTC
The team encountered an issue with testing the fix and reverted the update. We are publishing a new update that should resolve the backlogged message logs.
- monitoring Oct 21, 2024, 06:30 PM UTC
The fix has landed in production and the team is monitoring the message log queue. Message event logs should be flowing normally.
- resolved Oct 21, 2024, 06:51 PM UTC
The data stream is unblocked, and the message logs queue is resolved and flowing normally.
Courier Inbox FetchMessages Connectivity Issues
Timeline · 8 updates
- investigating Oct 10, 2024, 03:38 PM UTC
The Courier Team is investigating an issue related to Inbox fetching messages in the component.
- identified Oct 10, 2024, 04:30 PM UTC
The team has identified the issue and will be rolling out a fix. The team will reprocess the impacted messages that were not fetched.
- monitoring Oct 10, 2024, 04:46 PM UTC
Inbox connectivity for fetching messages has been reestablished. The team is monitoring closely, and working on retroactively processing impacted messages to Inbox
- monitoring Oct 10, 2024, 05:50 PM UTC
We are continuing to gather a list of impacted messages to backfill impacted Inbox components.
- monitoring Oct 10, 2024, 06:06 PM UTC
We are continuing to monitor for any further issues.
- monitoring Oct 10, 2024, 07:07 PM UTC
The team has gathered impacted messages and is running tests before a release.
- monitoring Oct 10, 2024, 08:51 PM UTC
The team has started backfilling the impacted Inbox messages.
- resolved Oct 11, 2024, 02:33 AM UTC
Inbox connectivity issues resolved.
Send Pipeline and Event Status Slowdowns
Timeline · 5 updates
- identified Sep 06, 2024, 04:35 PM UTC
The Courier team has identified an issue with the send pipeline and event status updates that is causing queued messages and delayed webhook events. The Courier team will monitor the bottleneck and adjust message batching as necessary so that messages flow normally.
- identified Sep 06, 2024, 04:39 PM UTC
The team has increased our processing to help with the bottleneck.
- monitoring Sep 06, 2024, 04:44 PM UTC
The team has increased the send pipeline worker and messages are clearing the bottleneck.
- monitoring Sep 06, 2024, 04:52 PM UTC
Corrective actions have cleared the bottleneck and messages and events should be flowing normally.
- resolved Sep 06, 2024, 04:59 PM UTC
Bottleneck has cleared for messages and event statuses.
Delay in Message Processing
Timeline · 6 updates
- identified Aug 19, 2024, 06:32 PM UTC
The Courier team identified an issue in our health monitoring involving our message event processing. The issue has been identified and a revert is in place.
- monitoring Aug 19, 2024, 06:48 PM UTC
Our team has released a revert to address the regression and it's in the process of merging.
- monitoring Aug 19, 2024, 06:50 PM UTC
The release is live, and the team is monitoring it.
- monitoring Aug 19, 2024, 06:56 PM UTC
Release is published and building to production. ETA ~45 minutes.
- monitoring Aug 19, 2024, 08:02 PM UTC
Fix has been deployed, and enqueued messages have started to go through slowly. Once the bottleneck clears, messages should start to flow normally.
- resolved Aug 19, 2024, 09:35 PM UTC
The general pipeline has recovered.
Automation Service Degraded
Timeline · 3 updates
- investigating Jul 29, 2024, 10:03 PM UTC
The Courier team has identified an issue impacting Automation services, which has resulted in degraded performance. An underlying issue was identified at around 10:30 PST and a fix was released at 11:00 PST. We are monitoring the automation worker for any leads.
- monitoring Jul 29, 2024, 10:43 PM UTC
The team has identified the issue and is closely monitoring. Failed steps will continue to be retried with exponential backoff. Automations should recover, and all affected automations should execute after a delay of up to 15 minutes once the problem is resolved
- resolved Jul 29, 2024, 11:07 PM UTC
Automations have stabilized.
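The updates above mention that failed automation steps are retried with exponential backoff. As a rough illustration of that general technique (not Courier's actual implementation), the sketch below doubles a delay ceiling on each failed attempt and caps it. The function names, the 1-second base delay, the attempt limit, and the use of random jitter are all assumptions; the 900-second cap is chosen only to mirror the "up to 15 minutes" figure quoted in the update.

```python
import random
import time


def backoff_delay(attempt, base_delay=1.0, max_delay=900.0):
    """Delay ceiling in seconds before retry number `attempt` (0-based).

    Hypothetical parameters: a 1 s base and a 900 s (15 minute) cap.
    """
    return min(max_delay, base_delay * (2 ** attempt))


def retry_with_backoff(step, max_attempts=6, sleep=time.sleep):
    """Call `step` until it succeeds or `max_attempts` is exhausted."""
    for attempt in range(max_attempts):
        try:
            return step()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Jitter: sleep a random amount up to the capped ceiling so
            # that many failed steps do not all retry at the same moment.
            sleep(random.uniform(0, backoff_delay(attempt)))
```

A caller would wrap the failing operation, for example `retry_with_backoff(lambda: run_step(step_id))`, where `run_step` and `step_id` are hypothetical placeholders for whatever executes one automation step. Passing a no-op `sleep` makes the function easy to exercise in tests without waiting.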
Delay in Event Statuses Processing
Timeline · 2 updates
- identified Jul 26, 2024, 03:29 PM UTC
The team has identified an issue where event statuses are delayed. The team has identified the root cause and is implementing a fix.
- resolved Jul 26, 2024, 03:31 PM UTC
Issue was resolved and events are flowing normally.