`helm upgrade --install` doesn't perform an install/upgrade if the first ever install fails #3353

jstrachan · 2018-01-17T11:54:53Z

Using helm upgrade --install is a nice way to install or upgrade depending on whether the release exists. But it looks like there's a bug in the logic: it's not handling failed installs. In my case the first install failed; then a subsequent attempt wasn't even made, as it crashes out immediately.

Maybe if the last release failed then helm upgrade --install should delete it and install again?

$ helm list
NAME           	REVISION	UPDATED                 	STATUS  	CHART                         	NAMESPACE
foo            	2       	Wed Jan 17 11:48:08 2018	FAILED	something-0.0.1                     	default

$ helm upgrade "foo" . --install 
Error: UPGRADE FAILED: "foo" has no deployed releases

bacongobbler · 2018-01-17T16:50:44Z

This was intentional by design of #3097. Basically, diffing against a failed deployment caused undesirable behaviour, most notably this long list of bugs:

If your initial release ends up in a failed state, we recommend purging the release via helm delete --purge foo and trying again. After a successful initial release, any subsequent failed releases will be ignored, and helm will do a diff against the last known successful release.

Now that being said, it might be valuable to not perform a diff when no successful releases have been deployed. The experience would be the same as if the user ran helm install for the very first time in the sense that there would be no "current" release to diff against. I'd be a little concerned about certain edge cases though. @adamreese do you have any opinions on this one?

chancez · 2018-01-22T22:35:19Z

The suggested fix seems completely untenable in an automated system. I definitely don't want everything invoking helm to have to know about "if first release fails, delete and retry". For one, most of my tooling isn't aware if it's an install or upgrade, or if it's the first time or 100th time; it's almost always just running helm upgrade --install.

chancez · 2018-01-22T22:36:08Z

I'd also like to call out that I commented on the original PR #3097 (comment) asking specifically about this case.

stealthybox · 2018-01-25T06:35:33Z

The old behavior was better for this case.
I agree with @chancez. This makes upgrade --install non-idempotent for a common occurrence.

@bacongobbler
If we're worried about releases failing and leaving shrapnel due to failed hooks, I'd say that's a design issue in the chart. (Hooks work better when they are idempotent.)
Users are free to build error handling and non-idempotent behavior around helm.

What other edge-cases are we concerned about?
Seems that #3097 takes care of a lot 👍

bmperrea · 2018-01-31T20:28:34Z

My local development would go much smoother if I could make helm upgrade -i be idempotent even against Failed releases for at least some combination of arguments. My use case is when I have a script of many releases that I know I want to get up to start a local development env.

This might be analogous to the --replace flag for helm install. Note that --replace is one of only two flags from helm install that are missing in helm upgrade, the other being --name-template.

bacongobbler · 2018-02-01T08:59:11Z

To be absolutely clear, yes this would be a good thing to fix. Anyone wanna take a crack at it while we've got our hands full with other work?

If the first `upgrade --install` results in a FAILED state, you cannot run the same `upgrade --install` command again without a failure. This happens because we search only for releases with the status DEPLOYED. With this change, if the search for DEPLOYED fails, we then search for a release with the state FAILED and, if one is found, upgrade that. This fixes issue helm#3353
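
The fallback described in that commit message lives in Tiller's Go code. As a rough illustration only, the same selection rule can be sketched in shell over `helm history`-style text. The function name `pick_upgrade_base` is hypothetical, and the sketch assumes the Helm 2 table layout (revision in the first column, a five-token timestamp, then the status); it is not the code from the PR.

```shell
# Hypothetical helper (not part of Helm): reads a Helm 2 style history table
# on stdin and prints the revision an upgrade would start from under the rule
# described above: the newest DEPLOYED revision if one exists, otherwise the
# newest FAILED one.
pick_upgrade_base() {
    awk '
        $1 !~ /^[0-9]+$/ { next }    # skip the header and any non-data rows
        {
            # Assumption: columns are REVISION, a 5-token timestamp, STATUS.
            rev = $1 + 0
            if ($7 == "DEPLOYED" && rev > deployed) deployed = rev
            if ($7 == "FAILED" && rev > failed) failed = rev
        }
        END {
            if (deployed > 0) {
                print deployed
            } else if (failed > 0) {
                print failed
            } else {
                exit 1               # nothing to upgrade from
            }
        }
    '
}
```

Piping `helm history foo` into this function would print the revision to build on, or exit non-zero when the history holds neither a DEPLOYED nor a FAILED revision.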

sorenmat · 2018-02-01T20:13:23Z

Hi,
I've created a PR #3437 that should fix this issue

whereisaaron · 2018-02-02T06:23:47Z

I am not sure why we need the install and upgrade commands. I only ever use the upgrade --install command, and it seems like a lot of people do the same. I just need one command that does upgrade --install and doesn't trip over a failed run. Can we just rename upgrade --install to deploy, make it truly idempotent, and ditch the other two?

(I'm struggling with a new variant of this problem in 2.8.0. Since upgrading from 2.7.2, if I have a failed install, then delete --purge it, and then upgrade --install it, I can still get the Error: UPGRADE FAILED: "xyz" has no deployed releases error. It seems like --purge isn't fully effective in 2.8.0 and tiller has some stuck state not showing in list --all. I then have to do an install to get tiller back to a state where I can do the usual upgrade --install again.)

sorenmat · 2018-02-02T07:11:51Z

I agree with @whereisaaron. It would be nice to have a deploy command that worked more like kubectl apply. It makes automation of Helm much easier too, since you don't have to check for existing releases in some shell script madness :)

rchernobelskiy · 2018-02-15T20:52:50Z

Perhaps the solution is to have helm automatically run helm delete --purge?
Something like:

User executes helm upgrade --install
First release fails
User makes some changes to chart and executes again helm upgrade --install
Helm tries to run the command
It fails and there is precisely one prior release in failed state
Helm silently executes helm delete --purge
After purge, Helm auto-retries helm upgrade --install and shows output from that

Perhaps this behavior could be triggered via the --force flag which already has similar behavior for other scenarios

bacongobbler · 2018-02-15T21:07:15Z

Good idea, but I don't think we should ever delete the release ledger without the user explicitly asking to remove that data. Operators of Helm will want to learn why the service failed to upgrade from previously failed releases, or deduce failures by collecting that data from the ledger.

I provided a comment earlier in the thread that describes a solution to the issue. It's similar to your solution in execution, but without the need to delete the entire release ledger. I believe #3437 is attempting to apply that solution as a patch.

gmanolache · 2018-04-27T14:30:38Z

@rchernobelskiy happens to me as well. Exactly as you describe.

I run into this issue maybe once per day when deploying new apps.
It's a pain!

stealthybox · 2018-04-27T17:40:34Z

@gmanolache We're still on helm 2.7.0 for this reason.
It's unclear to me whether upgrading to use the --force flag is safe: comment

If you need to downgrade, here's a good way to do it: downgrade to 2.7.0

whereisaaron · 2018-04-28T05:38:09Z

What is this useful sounding 'helm ledger' diagnostic info and how do we get to it? 😄

I'm worried the below might be read as moody; it is genuinely just an invitation for pointers on how we can get diagnostic info when we have a failed deploy, because it really sounds like I'm missing something. It sounds like the failed state is supposed to have some utility for operators? I trawled through the helm manual site again; will something like 'helm get manifest' work in a failed state to extract useful diagnostic info?

My user experience when I get a failed deployment is you get no useful info. Helm disowns all the partially created/remaining resources such that 'helm status' doesn't show anything. All you can do is 'rollback' or 'delete --purge' (you can't just 'delete' or your CI 'upgrade --install' will keep failing). The failed state only seems to serve to break the idempotency of 'upgrade --install' that we all crave for our CI deployments.

Would it be reasonable to have an '--auto-rollback' option for CI situations, e.g. 'upgrade --install --auto-rollback'? I'd usually rather roll back than have to get out of bed to deal with a failed state 😆 😴 💤

bacongobbler · 2018-04-28T05:43:56Z

> What is this useful sounding 'helm ledger' diagnostic info and how do we get to it? 😄

helm help history

whereisaaron · 2018-04-29T03:12:24Z

Thanks @bacongobbler. Ok, I understand that this list is what is meant by the ledger. And if you still have the ledger, you can use helm get manifest --revision 123 to see what was deployed that failed? That is certainly useful to preserve. And if we rollback we don't lose that information.

History prints historical revisions for a given release.

A default maximum of 256 revisions will be returned. Setting '--max'
configures the maximum length of the revision list returned.

The historical release set is printed as a formatted table, e.g:

    $ helm history angry-bird --max=4
    REVISION   UPDATED                      STATUS           CHART        DESCRIPTION
    1           Mon Oct 3 10:15:13 2016     SUPERSEDED      alpine-0.1.0  Initial install
    2           Mon Oct 3 10:15:13 2016     SUPERSEDED      alpine-0.1.0  Upgraded successfully
    3           Mon Oct 3 10:15:13 2016     SUPERSEDED      alpine-0.1.0  Rolled back to 2
    4           Mon Oct 3 10:15:13 2016     DEPLOYED        alpine-0.1.0  Upgraded successfully

If we had helm upgrade --install --auto-rollback then both the failed deployment and the rollback would be recorded in the ledger and available to operators. And that would go a long way to preventing CI deployments getting to the intractable 'failed' state where 'helm upgrade --install' stops working. Failed CI deployments are usually developers injecting typos/mistakes into the deployment system. With '--auto-rollback' they can inspect the helm command error message retained in the deployment server log, fix the mistake, and deploy corrected values.

I guess even without the '--auto-rollback' option we could use a wrapper to automatically run helm rollback any time helm upgrade --install returns a 'FAILED' error, and maybe detect when it is the initial install and run helm delete --purge instead in those cases.

That is, we could fashion a wrapper script to ensure the result of a CI 'helm upgrade --install' is always a state where the next CI 'helm upgrade --install' will be possible, whilst retaining the ledger information for any failed attempts (at least for releases whose initial install worked).

helm deploy =

helm upgrade --install
if FAIL then
- if revision=1
- then helm delete --purge
- else helm rollback
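
The pseudocode above could be sketched as a POSIX shell function along these lines. This is only an illustration for the Helm 2 CLI: the name `helm_deploy` is hypothetical, rolling back to the revision just before the failed one is an assumption (that revision may itself have failed), and the history parsing assumes the revision number is the first column of each row.

```shell
# Hypothetical wrapper for the "helm deploy" idea above (Helm 2 commands).
# Usage: helm_deploy RELEASE CHART [extra "helm upgrade" flags...]
helm_deploy() {
    release=$1
    chart=$2
    shift 2

    if helm upgrade --install "$release" "$chart" "$@"; then
        return 0
    fi

    # The upgrade failed: find the highest revision number in the history.
    # Prints 0 when no history rows can be read.
    revision=$(helm history "$release" 2>/dev/null |
        awk '$1 ~ /^[0-9]+$/ && $1 + 0 > max { max = $1 + 0 } END { print max + 0 }')

    if [ "$revision" -le 1 ]; then
        # The initial install failed (or nothing was recorded): purge the
        # release so that the next run can install from scratch.
        helm delete --purge "$release"
    else
        # A later upgrade failed. Assumption: the previous revision is the
        # one to return to.
        helm rollback "$release" "$((revision - 1))"
    fi

    # Still report the failure to the caller (e.g. the CI job).
    return 1
}
```

A CI job would call something like `helm_deploy foo ./chart --wait` and treat a non-zero exit status as a failed deployment, while the next run can proceed without manual cleanup.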

gmanolache · 2018-04-30T08:34:49Z

@whereisaaron that would be elegant 👍

IdanAdar · 2018-05-02T07:00:29Z

Is there an easy way to get the latest working release other than something like helm history ${name} | tail -2 | head -1 | awk '{print $1}', to be used by helm rollback?
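
One possible alternative to the `tail | head` pipeline, which returns the second-to-last revision regardless of its status, is to filter on the STATUS column. The sketch below is not a built-in Helm feature: the function name `last_good_revision` is hypothetical, and it assumes the Helm 2 table layout (revision, a five-token timestamp, then status) and that SUPERSEDED marks revisions that were once deployed successfully.

```shell
# Hypothetical helper (not part of Helm): reads "helm history" output on
# stdin and prints the newest revision whose status is DEPLOYED or SUPERSEDED.
last_good_revision() {
    awk '
        $1 !~ /^[0-9]+$/ { next }    # skip the header and any non-data rows
        # Assumption: columns are REVISION, a 5-token timestamp, then STATUS.
        $7 == "DEPLOYED" || $7 == "SUPERSEDED" {
            if ($1 + 0 > best) best = $1 + 0
        }
        END {
            if (best > 0) {
                print best
            } else {
                exit 1               # no successful revision recorded
            }
        }
    '
}

# Example use, assuming the caller has set ${name}:
#   helm rollback "${name}" "$(helm history "${name}" | last_good_revision)"
```

The function exits non-zero when no such revision exists, which is the first-install-failed case discussed earlier in this thread.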

RickS-C137 · 2019-01-23T10:42:58Z

Hello there,

I'm using Helm 2.12.2 and still have the issue that helm fails when the first deployment has failed. Is this a regression, maybe?

eyalzek · 2019-01-30T14:20:29Z

I'm not sure it's a regression; rather, it was never actually "fixed".

whereisaaron · 2019-02-04T02:10:22Z

@RickS-C137 I think this is supposed to be fixed by using helm upgrade --install --force which will 'delete' then 'install --replace' a failed release.

konokimo · 2019-02-13T20:13:45Z

Still trying to fix this issue in a Jenkins Pipeline I am trying to use.
I am trying to deploy a new image of my application and I couldn't care less if the deployment already exists or not.
I want to run one command that either replaces the current deployment or just installs it if it does not exist.
I tried helm install --replace, but I often get Error: a released named xyz is in use, cannot re-use a name that is still in use, which obviously kills my pipeline and the build fails.

walkafwalka · 2019-03-02T23:04:46Z

@bacongobbler What do you think about #3353 (comment)?

I do not see how there would be downtime or data loss if we destroy and recreate the initial release if the initial release fails.

walkafwalka · 2019-03-03T22:59:24Z

I implemented this in our build:

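# If the latest revision is a FAILED initial install (revision 1), purge it first.
# grep -w matches "1" only as a whole word, so revisions such as 10 or 21 are not caught.
if helm history --max 1 "$name" 2>/dev/null | grep FAILED | cut -f1 | grep -qw 1; then
    helm delete --purge "$name"
fi

# Use the same release name that was checked above.
helm upgrade --install --wait "$name" chart/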

whereisaaron · 2019-03-04T03:53:43Z

With helm currently, you don't know which helm command+options combination to use without inspecting the current state. And for a given helm command you don't know what you are going to get, because it depends on what the current state is. That's not really the declarative desired state dream ☁️ 💤 😄

In helm 3 we can potentially deprecate install / upgrade / --replace / --upgrade / --force and replace them all with an idempotent helm deploy that either achieves the desired state, or leaves the state unchanged. Maybe using an algorithm similar to above, which if helm deploy fails, rolls back (revision > 1) or deletes+purges (revision = 1), to leave the state as it was before. The failed manifest would still be available via helm history/get. And there could even be a '--no-rollback' option for people who want to preserve the deployment in a failed state for investigation

The option of helm upgrade --install --force is getting close, except that rather than rolling back and upgrading, it deletes and replaces failed releases (even for revisions >1), which makes some people angry over on #3208... 😮 ⚡️ 💥

For right now we can use wrapper scripts or meta-tools like helmsman, whose feature list includes items that build on helm while mitigating this issue:

Idempotency: As long your desired state file does not change, you can execute Helmsman several times and get the same result. [...regardless of the current state]
Continue from failures: In the case of partial deployment due to a specific chart deployment failure, fix your helm chart and execute Helmsman again without needing to rollback the partial successes first.

kylecordes · 2019-04-08T12:46:04Z

> replace them all with an idempotent helm deploy that either achieves the desired state, or leaves the state unchanged

In retrospect, this is a breathtakingly obvious design goal.

dvlato · 2019-07-11T08:45:10Z

Hi,
In our case the initial release did not really fail. Either our application was not completely up when the install timeout elapsed, or there was some other strange issue that has since been fixed. In any case, the application is running perfectly fine, and thus having to delete it would be a problem for us (we have some persistent storage attached that would also be removed!!).

Is there any workaround to deploy a chart when the initial release 'apparently failed' but it's actually ok?

schollii · 2020-03-23T12:01:07Z

So is the conclusion that upgrade --force is too forceful, i.e. there are times when a delete+replace+retry_upgrade is not the correct remedy for a failed upgrade?

dcow · 2020-07-07T08:49:54Z

Is there a separate issue tracking the idea of merging install & upgrade into a deploy command?

hickeyma · 2020-07-07T09:16:37Z

Not that I know of, @dcow. What is the use case over the helm upgrade --install command?

dcow · 2020-07-07T09:40:41Z

#3353 (comment)

> I am not sure why we need the install and upgrade commands, I only ever use the upgrade --install command and it seems like a lot of people do the same. I just need one command that does upgrade --install and doesn't trip over a failed run. Can we just rename upgrade --install to deploy, make it truly idempotent, and ditch the other two?
> ...

and

#3353 (comment)

> With helm currently, you don't know which helm command+options combination to use without inspecting the current state. And for a given helm command you don't know what you are going to get, because it depends on what the current state is. That's not really the declarative desired state dream ☁️ 💤 😄
>
> In helm 3 we can potentially deprecate install / upgrade / --replace / --upgrade / --force and replace them all with an idempotent helm deploy that either achieves the desired state, or leaves the state unchanged.
> ...

I generally agree that helm should work like kubectl apply and attempt to achieve the desired reality, rather than needing to run different types of commands depending on the state of your cluster. I was hoping to add support to a dedicated issue if one existed, or at least figure out what the resolution was, since deploy is not currently implemented and we're on helm 3.2.

hickeyma · 2020-07-07T09:55:12Z

@dcow Ok, do you want to create an issue then with your proposal?

dcow · 2020-07-08T00:40:09Z

@hickeyma done #8415!

bacongobbler added the question/support label Jan 17, 2018

bacongobbler mentioned this issue Jan 31, 2018

helm upgrade requires deleting failed installs #3415

Closed

bacongobbler mentioned this issue Feb 1, 2018

helm upgrade --install no longer works #3208

Closed

sorenmat mentioned this issue Feb 1, 2018

fix upgrade of broken install #3437

Closed

bmperrea mentioned this issue Feb 2, 2018

"No release found" when first time installation fails #3429

Closed

bacongobbler mentioned this issue Mar 2, 2018

replace FAILED deployments with helm upgrade --install --force #3597

Merged

bacongobbler closed this as completed in #3597 Mar 9, 2018

IdanAdar mentioned this issue May 2, 2018

Best practice for installing and/or upgrading a deployed and/or failed release #4004

Closed

mumoshu mentioned this issue May 30, 2018

Helmfile doesn't work when no internet access roboll/helmfile#155

Closed

alejandroEsc mentioned this issue Jun 6, 2018

[stable/etcd-operator]: Error: apiVersion "etcd.database.coreos.com/v1beta2" in templates/etcd-cluster-crd.yaml is not available helm/charts#5328

Closed

This was referenced Aug 30, 2018

fix(tiller): upgrade last deployed release #3097

Merged

feat: Automated rollback of failed release roboll/helmfile#256

Open

mastachand mentioned this issue Sep 6, 2018

Deployment of prometheus-operator sometimes fails scality/metalk8s#237

Closed

ashinohara mentioned this issue Oct 4, 2018

apply fails if the previous helm install command fails djhaskin987/terraform-provider-helmcmd#7

Closed

l0rd mentioned this issue Mar 4, 2019

Command server:start fails if che helm chart has status DELETE or DELETING che-incubator/chectl#18

Closed

scottyhq mentioned this issue Apr 5, 2019

changed load balancer ip pangeo-data/pangeo-cloud-federation#207

Merged

alexnuttinck mentioned this issue Aug 27, 2019

Problem with upgrade cetic/helm-nifi#11

Closed

phumberdroz mentioned this issue Dec 17, 2019

Helm 3 upgrade --install faild has no deployed release #7257

Closed

barry-dow mentioned this issue Mar 5, 2020

Helm v3.0.3 not protecting from overwriting successful deployments #7736

Closed

schollii mentioned this issue Mar 23, 2020

dealing with failed releases when upgrade/install #7793

Closed

peterholak mentioned this issue Apr 14, 2020

app-name has no deployed releases #5595

Closed

bacongobbler mentioned this issue Apr 14, 2020

fix: allow upgrading when history has nothing but failures #7913

Closed

dcow mentioned this issue Jul 8, 2020

Feature Request: helm deploy subcommand #8415

Closed
