Expected Behavior
When tilt ci runs, if it finds that a Job already exists (and the pod-template-hash matches) but the Job has never completed (all of its pods exited with an error), it would recreate the Job to start it again.
I'm not sure if this is a bug or a feature request, because I can't find any docs that say it should work this way. Maybe I'm assuming something based on how tilt up behaves?
Current Behavior
tilt seems to attach to an arbitrary pod that has already terminated (it does not appear to be the most recent or the oldest pod). The output in the logs is:
Attaching to existing pod (db-init-cnbxf). Only new logs will be streamed.
Then tilt ci exits immediately with error: Error: Pod "db-init-cnbxf" failed.
Steps to Reproduce
Given these files (see the hypothetical sketch below):
script.sh
Tiltfile
Run tilt ci once, and the output from this job is printed with the date.
Run tilt ci again many times and the pod never runs again (the job controller will recreate the pod occasionally if the spec allows it). tilt ci says it's attaching to the terminated pod, then exits.
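The actual file contents aren't shown in this excerpt, so the following Tiltfile is a minimal hypothetical sketch, inferred only from the db-init Job name in the logs and the date output described in the steps; it is not the reporter's actual setup. script.sh is assumed to print the date and then exit non-zero, since that would leave the Job in the never-completed state described above.

# Tiltfile (hypothetical sketch, not the reporter's real files)
k8s_yaml(blob("""
apiVersion: batch/v1
kind: Job
metadata:
  name: db-init
spec:
  backoffLimit: 0  # assumed: fail fast, no retries by the Job controller
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: db-init
        image: busybox
        # Stand-in for script.sh: print the date, then fail so the pod
        # terminates with an error, matching the reported behaviour.
        command: ["sh", "-c", "date; exit 1"]
"""))

With a spec like this, a second tilt ci run applies an unchanged Job, so nothing re-runs and Tilt attaches to the already-failed pod.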
Context
About Your Use Case
We create environments for CI ahead of time using tilt ci. When one of those fails due to some flaky test or infrastructure problem, we attempt to retry with tilt ci. We've noticed those retries don't end up working most of the time because of this behaviour.
oof, this is tricky. The short version is that this is currently working as designed.
re: "I can't find any docs that say it should work this way" - here's a good doc on tilt's execution model - https://docs.tilt.dev/controlloop. Basically, you can think of it as docker build && kubectl apply && kubectl wait. tilt ci mainly adds exit conditions.
The fundamental problem is that if you kubectl apply a Job, and the spec of the Job hasn't changed, then (from Kubernetes' perspective) there's no reason to re-run the Job. From the apiserver's perspective, the whole contract of apply is that if the spec of an object hasn't changed, then the system should do nothing.
Tilt inherits this behavior -- if the Job hasn't changed, then the Job shouldn't be re-run.
There have been discussions of this over the years (e.g., kubernetes/kubernetes#77396), but lots of stuff relies on this behavior.
I guess the simple workaround right now is to add something like this to your Tiltfile:
if config.tilt_subcommand == 'ci':
    local('./clean-up-old-jobs.sh')
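If a separate script is more than you need, the same idea works inline. This is a minimal sketch, assuming the stuck Job is named db-init (the name from the logs above); clean-up-old-jobs.sh in the snippet above is just a placeholder name.

if config.tilt_subcommand == 'ci':
    # Delete any leftover Job so the next apply creates a fresh one;
    # --ignore-not-found makes this a no-op on a clean cluster.
    local('kubectl delete job db-init --ignore-not-found')

Because the Tiltfile runs before resources are applied, the delete happens first, so Kubernetes sees a brand-new Job on the next apply and runs it from scratch.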