Tyk incident
Control Plane Redis storage instability in us-east-1
Tyk experienced a major incident on January 3, 2024 affecting Controller at aws-us-east-1, lasting 1d 3h. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Jan 03, 2024, 12:37 PM UTC
Our monitoring has alerted us to an increase in storage-related errors in the Redis clusters in the aws-us-east-1 zone. The errors were caused by a rolling of the storage nodes; a mitigation procedure is being identified and put into action. Please avoid re-deploying or restarting deployments in this zone until the incident is resolved.
- identified Jan 03, 2024, 05:24 PM UTC
The cause of the storage instability has been identified, and there is currently no risk of recurrence. Meanwhile, mitigation procedures to prevent future data loss in the remaining deployments are in progress.
- identified Jan 03, 2024, 10:55 PM UTC
Initial mitigation is now complete. The SRE team will keep working to make sure every deployment has full redundancy. We will post an update once this is complete.
- resolved Jan 04, 2024, 04:14 PM UTC
Functionality and normal redundancy are now restored.