Section incident
Errors delivered for a subset of requests in Sydney region
Section experienced a major incident on March 29, 2021, affecting Sydney and lasting 28m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Mar 29, 2021, 12:47 AM UTC
We are currently investigating this issue.
- identified Mar 29, 2021, 12:56 AM UTC
The issue has been identified and a fix is being implemented.
- monitoring Mar 29, 2021, 01:04 AM UTC
A fix has been implemented and we are monitoring the results.
- identified Mar 29, 2021, 01:09 AM UTC
The issue has been identified and a fix is being implemented.
- monitoring Mar 29, 2021, 01:13 AM UTC
A fix has been implemented and we are monitoring the results.
- resolved Mar 29, 2021, 03:43 AM UTC
This incident has been resolved. Error rates have returned to normal and the affected PoPs have been returned to service.
- postmortem Mar 30, 2021, 08:03 PM UTC
**Incident Root Cause and Corrective Summary** Network connectivity between Section's Sydney PoPs and AWS and Azure services was impaired. This caused errors to be served on cache misses for customers with AWS- and Azure-hosted origins. As a result, the affected PoPs were removed from Section's delivery network and the platform began directing customer traffic away from them. The problematic route between Sydney and the AWS and Azure networks was failing intermittently, which kept its failure rate beneath the threshold for automatic route removal. Once the unhealthy route to the AWS and Azure networks was disabled, the affected PoPs were returned to service and traffic was restored.
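The interaction between an intermittent failure and a removal threshold can be illustrated with a minimal sketch. This is not Section's health-check logic: the sliding-window approach, the `WINDOW` and `THRESHOLD` values, and the function names are all assumptions chosen for illustration. It shows how a route that fails a steady fraction of probes can stay under a rate-based threshold indefinitely, while a hard failure trips it immediately.

```python
from collections import deque

# Hypothetical values for illustration only; the real threshold and
# window used by the platform are not stated in the postmortem.
WINDOW = 10       # number of most recent probes considered
THRESHOLD = 0.5   # failure rate at or above which a route is removed


def max_windowed_failure_rate(outcomes, window):
    """Return the highest failure rate seen over any full sliding window.

    `outcomes` is an iterable of booleans where True means the probe failed.
    """
    recent = deque(maxlen=window)
    failures = 0
    worst = 0.0
    for failed in outcomes:
        if len(recent) == window:
            failures -= recent[0]  # the oldest probe is about to be evicted
        recent.append(failed)
        failures += failed
        if len(recent) == window:
            worst = max(worst, failures / window)
    return worst


def would_remove(outcomes):
    """True if the route would ever cross the (assumed) removal threshold."""
    return max_windowed_failure_rate(outcomes, WINDOW) >= THRESHOLD


# An intermittent route: 2 of every 10 probes fail.
intermittent = [i % 10 < 2 for i in range(100)]
# A hard-down route: every probe fails.
hard_down = [True] * 100

print(would_remove(intermittent))  # intermittent route is never removed
print(would_remove(hard_down))     # hard-down route is removed
```

In this sketch the intermittent route still fails one in five requests, yet it is never removed automatically, which matches the behavior the summary describes and is why the route had to be disabled by other means.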