Immuta experienced a major incident on June 13, 2023, lasting 3h 22m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Jun 13, 2023, 07:57 PM UTC
We want to inform you that the Immuta SaaS platform is currently experiencing a partial outage due to a performance-impacting event in the AWS IAM service. While some users may not be impacted, others may experience degraded performance or intermittent connectivity issues. Our team is actively working to resolve the issue and restore service to a fully operational state. We apologize for any inconvenience this may cause and appreciate your patience during this time. If you have any questions regarding this issue, please contact our Immuta Customer Support team by opening a ticket at https://support.immuta.com or, in the event of an emergency, by sending an email to [email protected].
- investigating Jun 13, 2023, 08:53 PM UTC
We are continuing to monitor this situation. Amazon has reported back that services are being restored and should be fully operational soon.
- monitoring Jun 13, 2023, 09:08 PM UTC
Amazon has indicated that services have been restored and we are no longer seeing errors in our logging. We will continue to monitor this situation to ensure all services are working as expected.
- monitoring Jun 13, 2023, 09:08 PM UTC
We are continuing to monitor for any further issues.
- resolved Jun 13, 2023, 11:19 PM UTC
This incident has been resolved.
- postmortem Jun 14, 2023, 11:21 AM UTC
Beginning at 18:49 UTC on June 13, 2023, we started receiving sporadic error messages from the AWS (Amazon Web Services) platform, many of which would not affect the Immuta platform. As time progressed, more AWS systems failed, creating the potential for errors to propagate into the Immuta solution, and several customers in the North America region were impacted.

Amazon released an RCA for the situation:

> _We quickly narrowed down the **root cause to be an issue with a subsystem responsible for capacity management for AWS Lambda,** which caused errors directly for customers (including through API Gateway) and indirectly through the use by other AWS services._

As AWS restored service, we monitored our systems to confirm that service had been fully restored.