API Service Disruption
Timeline · 1 update
- resolved May 04, 2026, 04:50 PM UTC
The issue affecting our API services has been resolved. All systems are now operating normally. We apologize for any inconvenience this may have caused.
Writer had 15 outages in the last 2 years totaling 24790h 30m of downtime — averaging 0.6 incidents per month.
There have been 15 Writer outages since June 12, 2025, totaling 24790h 30m of downtime. Each is summarised below with incident details, duration, and resolution information.
The issue affecting our API services has been resolved. All systems are now operating normally. We apologize for any inconvenience this may have caused.
The issue affecting WRITER Agent has been resolved. All systems are now operating normally. We apologize for any inconvenience this may have caused.
In Writer Agent (WA), after authenticating and selecting a connector to retrieve context, some users (but not all) are prompted to configure the connector repeatedly, in a loop, until the request fails to return content.
TemplateHighP95Latency
On 2026-03-10, Writer Agent experienced a SEV1 full outage driven by database lock contention during deployment. An in-band schema migration waited on long-lived idle-in-transaction sessions, causing query queueing on core.threads and rapid connection pool exhaustion across pods. Service recovered after terminating blocking sessions and stabilizing rollout; follow-up actions are in progress to harden migration safety and transaction lifecycle controls.
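The failure mode above (a schema migration queueing behind long-lived idle-in-transaction sessions) can be illustrated with a small pre-flight check. This is a minimal sketch, not Writer's actual remediation. The report does not name the database, so the sketch assumes PostgreSQL, where "idle in transaction" is a session state and `lock_timeout` and `statement_timeout` are session settings. The `Session` record, the function names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Session:
    """Hypothetical session record, e.g. populated from pg_stat_activity (assumed PostgreSQL)."""
    pid: int
    state: str               # e.g. "active", "idle", "idle in transaction"
    xact_age_seconds: float  # seconds since the session's transaction started


def find_blocking_sessions(sessions: List[Session],
                           max_xact_age_seconds: float = 60.0) -> List[Session]:
    """Return sessions that have sat idle in a transaction longer than the threshold."""
    return [
        s for s in sessions
        if s.state == "idle in transaction" and s.xact_age_seconds > max_xact_age_seconds
    ]


def migration_preamble(lock_timeout_ms: int = 5000,
                       statement_timeout_ms: int = 60000) -> List[str]:
    """Build SET statements to run before DDL so it fails fast instead of queueing."""
    return [
        f"SET lock_timeout = '{lock_timeout_ms}ms';",
        f"SET statement_timeout = '{statement_timeout_ms}ms';",
    ]


def preflight(sessions: List[Session],
              max_xact_age_seconds: float = 60.0) -> Tuple[bool, List[int]]:
    """Return (safe_to_migrate, pids_of_blockers) for the given session snapshot."""
    blockers = find_blocking_sessions(sessions, max_xact_age_seconds)
    return (len(blockers) == 0, [s.pid for s in blockers])


if __name__ == "__main__":
    snapshot = [
        Session(pid=101, state="active", xact_age_seconds=2.0),
        Session(pid=102, state="idle in transaction", xact_age_seconds=900.0),
    ]
    print(preflight(snapshot))
    print(migration_preamble())
```

With a short lock timeout, a blocked migration aborts and can be retried, rather than holding a queued lock request that stalls every later query on the table.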
Envoy pod q7fjq tripped its default ext_authz circuit breaker (max_connections: 1,024) after absorbing 57% of traffic due to a rollout without readiness probes, then getting hit by a GKE node removal reconnection storm. The circuit breaker locked for 17 minutes, returning HTTP 500 for ~62% of auth requests on that pod. 9.4% of all user-facing requests failed (5,203 of 55,449). The system self-recovered when the auth connection latency tail cleared naturally. The other two envoy pods were completely healthy throughout. A skynet-frontend restart at 17:41:15 was coincidental (old pods still alive when CB cleared at 17:42:00).
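The overall failure rate quoted above can be checked directly from the two request counts in the report. The helper name below is hypothetical and the snippet is only an arithmetic check.

```python
def failure_rate_percent(failed: int, total: int) -> float:
    """Return failed requests as a percentage of all requests."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * failed / total


# Counts taken from the incident summary: 5,203 failed of 55,449 requests.
rate = failure_rate_percent(5203, 55449)
print(f"{rate:.1f}%")  # prints 9.4%
```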
Palmyra Vision is experiencing issues when served through public API requests.
Outage on Baseten embedding models
Writer experienced a service disruption from 2:30 PM to 2:40 PM PT (22:30–22:40 UTC), during which users received 403 errors. This was caused by planned maintenance that temporarily affected inbound traffic. Service has been restored.
Feb 11, 21:13 UTC Resolved - Writer experienced a temporary service disruption between 12:07 PM PT (20:07 UTC) and 12:24 PM PT (20:24 UTC), during which some users were unable to load AI Studio and Writer Agent functionality. Service was restored and is fully functional.
Dec 11, 16:13 UTC Resolved - We have confirmed the incident is now resolved and service has been restored. Dec 11, 15:49 UTC Monitoring - A fix has been rolled out and service has been restored. We are continuing to monitor to ensure full remediation. Dec 11, 15:34 UTC Investigating - We are aware of an issue accessing Writer Agent functionality. The team is investigating as a priority.
Dec 8, 23:06 UTC Resolved - We have confirmed that functionality has been restored. Dec 8, 22:27 UTC Monitoring - A fix has been implemented and we are monitoring to ensure successful resolution. Dec 8, 21:59 UTC Update - We have identified an issue and are currently testing a fix to remediate the AI Studio No Code and Agent Builder errors. Dec 8, 21:24 UTC Investigating - We are currently investigating an issue where users are receiving error messages when requesting content in AI Studio.
Oct 21, 00:26 UTC Resolved - AWS has updated the incident with details indicating that the issue has been resolved. We do not see further impact resulting from the outage at this time. We will continue to monitor the situation and follow up with AWS. Oct 20, 23:37 UTC Update - We are continuing to monitor the issue related to our public API and Agent Builder functionality. Oct 20, 21:54 UTC Identified - We have seen a recent update to our API processing sub-service that is currently causing an elevated rate of errors on the API. This is affecting our public-facing APIs and Agent Builder product. Oct 20, 10:25 UTC Investigating - Please note that we are aware of the AWS outage affecting multiple services. We have not seen an impact to customer systems at this time but are continuing to monitor.