Felix incident
Intermittent access issues with the login page
Felix experienced a critical incident on June 25, 2023 (UTC) affecting the Contractor Portal and Vendor Portal, lasting 4h 45m. The incident has been resolved; the full update timeline is below.
Affected components
- Contractor Portal
- Vendor Portal
Update timeline
- investigating Jun 25, 2023, 10:46 PM UTC
We are currently experiencing intermittent access issues with our login page, and our team is actively working to resolve the problem.
- investigating Jun 26, 2023, 12:20 AM UTC
We are continuing to investigate the platform access issues with our login page. Our team is working to resolve the problem as a priority.
- identified Jun 26, 2023, 01:06 AM UTC
We have now identified the probable root cause of the access issues with our login page. We are currently working on a solution to resolve the issue.
- monitoring Jun 26, 2023, 01:22 AM UTC
We have now restored access to the application and are monitoring the platform.
- resolved Jun 26, 2023, 03:31 AM UTC
This incident has been resolved.
- postmortem Jul 03, 2023, 05:11 AM UTC
On Monday, 26 June 2023, the Felix platform experienced an unexpected, widespread application outage for an extended period. The incident was triggered by modifications made to our code deployment script as part of infrastructure upgrade testing, which took place on Friday, 23 June 2023. These modifications caused an error in one of our primary application services when server auto-scaling occurred on Monday morning. The error had a cascading effect on other applications that rely on the affected primary application to operate, and ultimately resulted in widespread application access issues for users of Felix. We took immediate steps to correct the issue and to prevent this type of error from recurring. Additionally, we have introduced new monitoring steps to detect and report on primary application health. We appreciate your patience during this incident and apologise for any inconvenience caused.