Octopus incident
Some Octopus Cloud customers affected by missing variable substitution on Helm template sources
Octopus experienced a minor incident on January 24, 2025 affecting Octopus Cloud, lasting 3d 20h. The incident has been resolved; the full update timeline is below.
Update timeline
- identified Jan 24, 2025, 01:35 AM UTC
We have had reports of and have identified an issue with variable substitution with Helm template sources. We are tracking the fix here: https://github.com/OctopusDeploy/Issues/issues/9224
- monitoring Jan 24, 2025, 08:58 AM UTC
We have fixed the issue, but due to timing and other factors will not be rolling this out immediately. If you are affected by this issue, please get in touch with [email protected] who can manually update your instance.
- resolved Jan 27, 2025, 10:21 PM UTC
This issue has been resolved and will be rolling out to Octopus Cloud customers in the next couple of days. If you are affected by this issue, please get in touch with [email protected]
- postmortem Feb 05, 2025, 10:30 PM UTC
# Report and learnings: Missing variable substitution on Helm template sources

###### Author: Kevin Tchang

# Summary

The Octopus Server `2025.1.6849` update introduced a bug in the _Deploy a Helm Chart_ step that prevented variable substitution from working correctly in certain Helm template value sources. As a result, Helm deployments that had previously succeeded began to fail for Cloud customers.

# Background

[Calamari](https://octopus.com/docs/octopus-rest-api/calamari) is a deployment tool used by Octopus to execute deployment tasks on target machines, such as extracting packages and running scripts. It supports the deployment of many built-in Octopus steps, including the _Deploy a Helm Chart_ step, which allows users to deploy Helm charts to Kubernetes clusters.

When deploying Helm charts, Octopus allows users to pass values into the Helm release via **Helm Template Value Sources** (Helm TVS). These values can come from various sources, including charts, packages, Git repositories, key-values, or inline YAML. The Helm TVS are sent to Calamari during deployment to configure the Helm chart correctly.

**Octopus variable substitution** allows dynamic replacement of placeholder variables within deployment configurations. This enables users to store sensitive data or environment-specific values as variables in Octopus and substitute them during deployment.

## Incident timeline

_(All dates and times below are shown in UTC)_

##### 23/1/2025 – 18:49 (5:49 AEDT)

We received the first reports of Helm deployments failing due to missing variable substitutions.

##### 23/1/2025 – 23:39 (10:39 AEDT)

Our support team escalated the issue to our engineering teams.

##### 24/1/2025 – 1:35 (12:35 AEDT)

Our internal incident response process was initiated.

##### 24/1/2025 – 8:58 (19:58 AEDT)

The fix for the bug was merged (`2025.1.7389`), and our Status Page was updated to _Monitoring_. However, due to the timing of a recent Octopus Server upgrade to .NET 9 and concerns about stability, the fix was not immediately rolled out to all Cloud customers. In the meantime, our support team provided assistance with manual upgrades.

##### 27/1/2025 – 22:21 (9:21 AEDT)

After the long weekend (due to a public holiday), the Status Page was updated to _Resolved_ once the fix was confirmed by customers.

## Technical details

Helm TVS are transmitted to Calamari as strings within a JSON array structure. On receiving them, Calamari parses the JSON array into the respective TVS types (such as package, chart, inline YAML, etc.), which are then used by the `helm upgrade` command.

The bug was inadvertently introduced during a recent change aimed at eliminating the need to escape quotes when using Octopus variable values in Helm TVS. Prior to this change, variable values with unescaped quotes caused parsing errors because the quotes were misinterpreted when processed from the JSON array structure.

The change modified the process so that variable substitution occurs after the JSON array is parsed into its respective TVS types, rather than before. However, this substitution was initially applied only to the **inline YAML TVS type**, and not to the other four TVS types, which could also contain variables to substitute.

This resulted in a regression. The fix applied variable substitution to all TVS types and ensured it was handled correctly wherever a variable could appear.

## Remediation and next steps

At Octopus, ensuring deployment reliability is a top priority. Following this incident, we conducted a comprehensive review to identify areas for improvement.

Given the flexibility and complexity of our product features, designing a robust testing process for all scenarios can be challenging. However, we’re taking proactive steps to improve how we test variable evaluation features.
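
To make the ordering change described under Technical details more concrete, here is a minimal Python sketch of "parse first, substitute afterwards", applied to every value source rather than only inline YAML. It is an illustration only and not Calamari's actual code. The field names (`Type`, `Value`, `ValuesFilePaths`), the function names, and the sample variables are all hypothetical. The regular expression only handles simple `#{Name}` placeholders, not the full Octopus variable syntax.

```python
import json
import re

# Simplified matcher for Octopus-style placeholders such as #{Env}.
# Assumption: plain names only; filters and nested expressions are not handled.
_PLACEHOLDER = re.compile(r"#\{([^}]+)\}")


def substitute(text, variables):
    """Replace #{Name} placeholders in a string; unknown names are left as-is."""
    return _PLACEHOLDER.sub(lambda m: variables.get(m.group(1), m.group(0)), text)


def substitute_deep(node, variables):
    """Apply substitution to every string value in a parsed JSON structure."""
    if isinstance(node, str):
        return substitute(node, variables)
    if isinstance(node, list):
        return [substitute_deep(item, variables) for item in node]
    if isinstance(node, dict):
        return {key: substitute_deep(value, variables) for key, value in node.items()}
    return node


def load_sources(raw_json, variables):
    """Parse the JSON array first, then substitute in every source type."""
    sources = json.loads(raw_json)
    return [substitute_deep(source, variables) for source in sources]


# Hypothetical payload: three value sources, each containing a placeholder.
variables = {"Greeting": 'say "hi"', "Env": "prod"}
raw = json.dumps([
    {"Type": "InlineYaml", "Value": "greeting: '#{Greeting}'"},
    {"Type": "KeyValues", "Value": {"env": "#{Env}"}},
    {"Type": "Package", "ValuesFilePaths": "values.#{Env}.yaml"},
])

resolved = load_sources(raw, variables)

# Substituting before parsing breaks on the unescaped quotes in Greeting.
try:
    json.loads(substitute(raw, variables))
    substitute_first_ok = True
except json.JSONDecodeError:
    substitute_first_ok = False
```

In this sketch, `resolved` has placeholders replaced in all three sources, while `substitute_first_ok` ends up `False` because the quoted variable value corrupts the JSON before it can be parsed. A check that iterates over every source type in this way is one possible shape for a regression test on variable evaluation.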