-
Notifications
You must be signed in to change notification settings - Fork 268
init-assets should run until it works... #937
Conversation
please wait for review from @s-urbaniak before merging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR tries to delegate restart logic towards systemd and the outlined changes navigate around limitations in systemd. While I understand what we are trying to achieve here, I am not convinced that this is the right approach.
First this PR unfortunately breaks "vanilla" k8s functionality and the separation of concerns between the three systemd service units init-assets.service
, bootkube.service
, and tectonic.service
.
All of these services have been marked as Type=oneshot
conciously to have a clear happens-before-relationship for the startup ordering between them. If and only if init-assets
succeeds successfully, it is marked as started, causing bootkube
to start. If and only if bootkube
is started successfully, tectonic
is being as started. All of this models the serial startup behavior in a pretty simple manner.
Now we want to introduce restart semantics using Restart=...
. This unfortunately doesn't work with Type=oneshot
so this PR removes the directive, marking them as Type=simple
effectively.
I was able to simulate the above described behavior using Type=fork
. But even marking the services as Type=fork
in conjunction with Restart=...
we have a problematic edge case. If the init-assets
service fails, the bootkube
service also fails with a "failed dependency" error. systemd would try to restart init-assets
, but not bootkube
, even if init-assets
eventually succeeds after some retries.
I suppose that is the reason why this PR compacts all the script invocation in an ExecStartPre
directive which overcomes this problem.
We could overcome this too by introducing dedicated target, eventually reached once the preceding service succeeds, but this introduces stage-based complexity bloat.
So I see two options:
- Implement the retry logic inside
init-assets.sh
itself and leave the current (simple) construct. - Use
Restart=on-failure
. If we want to retain the vanilla k8s functionality we would have to make this unit a systemd template unit and inject a true/false value for disabling/enabling tectonic.
Im leaning towards solution 1. as this is effectively what we already do in tectonic.sh
.
/cc @lucab with whom I have discussed subtle systemd semantics.
|
||
data "ignition_systemd_unit" "tectonic" { | ||
name = "tectonic.service" | ||
enable = "${var.tectonic_service_disabled == 0 ? true : false}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By removing both the tectonic and bootkube service we are disabling the functionality to install a vanilla k8s cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but why is that a feature we support in the tectonic installer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was requested in #133 and this feature is currently used by the Prometheus team in their CI pipeline on AWS to test operator versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link.
@@ -125,14 +123,3 @@ data "ignition_systemd_unit" "init-assets" { | |||
enable = "${var.assets_s3_location != "" ? true : false}" | |||
content = "${file("${path.module}/resources/services/init-assets.service")}" | |||
} | |||
|
|||
data "ignition_systemd_unit" "bootkube" { | |||
name = "bootkube.service" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the bootkube service actually removed?
|
||
[Service] | ||
Type=oneshot | ||
RemainAfterExit=true | ||
Restart=always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will always restart the service, also when it exits successfully, see https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=, effectively busy-looping after installation. The only thing preventing it from busy-looping is the ConditionPathExists which seems like a fragile construct to me.
I strongly suggest to set Restart=on-failure
. This will not potentially busy-loop, if the scripts exits with 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
||
[Service] | ||
Type=oneshot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, indeed Type=oneshot
does not work with Restart
, that's a bummer. There is a corresponding upstream issue: systemd/systemd#2582
|
||
ExecStart=/usr/bin/bash /opt/tectonic/init-assets.sh | ||
ExecStartPost=/bin/touch /opt/tectonic/init_assets.done | ||
TimeoutSec=infinity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This raised my eyebrows, but is also the effective setting when Type=oneshot
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ExecStartPres timeout otherwise.
To add more solutions to the brainstorm-table:
|
Is this feature explicitly supported? If so, I'd be happy to rework this PR.
Init-assets, bootkube, and tectonic are tightly coupled in any case. They must run sequentially, may take an indefinite amount of time, and they should only run once, ever. Spreading the coupling across different services does not reduce the complexity of the system or make it easier to reason about.
Sure, but you don't get a cluster if any of the scripts exit for any reason.
Mostly. I believe PostExecs only run if everything exits 0, so the lockfiles won't be around if the units fail.
Agreed.
I disagree given the general impossibility of writing robust, mission critical bash and the number of existing bugs in the current scripts - init-assets.sh, tectonic.sh and bootkube.sh. In so many words, if bash were capable of doing this, I suspect systemd wouldn't exist. |
@kans thanks a lot for the good discussion! I totally agree about the robustness point about bash ;-) I was even thinking about implementing all the You have a good point about the complexity of orchestrating those three service unit files though which can be avoided. Once I am in the office I'll piggy-pack on this PR to create a follow-up commit using a templated systemd service unit file. |
@kans so I iterated on idea 2. described above and tested a commit which builds on top of this PR, PTAL: Also to have qualified answers to your comments I iterated on the single bullet items:
Please see #937 (comment)
You have a point, sure. This reasoning also amplifies my conjecture to program at least
Agreed
That is indeed correct.
Totally agreed as described above. |
Note: this is to fix #924 |
Approved by mistake. Was trying to read through the issue. |
Thanks for the links and backstory. Simple services and |
I looked for other tickets, but it looks like this has not been resolved yet. another failure example in init-assets
|
This PR also makes bootkube.sh and tecotnic.sh run sequentially after init-assets completes which was the behavior before #177 . This PR does not delete all dead code or increase test timeouts which we would probably need to bump by 5 minutes.