This repository has been archived by the owner on Feb 5, 2020. It is now read-only.

init-assets should run until it works... #937

Closed
wants to merge 1 commit into from

Conversation


@kans kans commented May 31, 2017

This PR also makes bootkube.sh and tectonic.sh run sequentially after init-assets completes, which was the behavior before #177. This PR does not delete all dead code or increase test timeouts, which we would probably need to bump by 5 minutes.


sym3tri commented May 31, 2017

please wait for review from @s-urbaniak before merging

@kans kans changed the title from "WIP init-assets should run until it works..." to "init-assets should run until it works..." May 31, 2017

@s-urbaniak s-urbaniak left a comment


This PR tries to delegate restart logic to systemd, and the outlined changes work around limitations in systemd. While I understand what we are trying to achieve here, I am not convinced that this is the right approach.

First, this PR unfortunately breaks "vanilla" k8s functionality and the separation of concerns between the three systemd service units: init-assets.service, bootkube.service, and tectonic.service.

All of these services have been consciously marked as Type=oneshot to establish a clear happens-before relationship for their startup ordering. Only if init-assets exits successfully is it marked as started, causing bootkube to start; only if bootkube starts successfully is tectonic started. All of this models the serial startup behavior in a pretty simple manner.
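The happens-before chain can be sketched as a unit fragment (abridged; the Description and script path are illustrative, not copied from the repo):

```ini
# bootkube.service (sketch): starts only after init-assets.service
# has exited successfully; tectonic.service repeats the same pattern
# against bootkube.service.
[Unit]
Description=Bootstrap a Kubernetes cluster
Requires=init-assets.service
After=init-assets.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/bin/bash /opt/tectonic/bootkube.sh
```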

Now we want to introduce restart semantics using Restart=.... This unfortunately doesn't work with Type=oneshot, so this PR removes the directive, effectively marking the services as Type=simple.

I was able to simulate the behavior described above using Type=forking. But even with the services marked Type=forking in conjunction with Restart=..., we have a problematic edge case: if the init-assets service fails, the bootkube service also fails with a "failed dependency" error. systemd would then try to restart init-assets, but not bootkube, even if init-assets eventually succeeds after some retries.

I suppose that is the reason why this PR compacts all of the script invocations into an ExecStartPre directive, which overcomes this problem.

We could also overcome this by introducing a dedicated target, reached once the preceding service succeeds, but that introduces stage-based complexity bloat.

So I see two options:

  1. Implement the retry logic inside init-assets.sh itself and leave the current (simple) construct.
  2. Use Restart=on-failure. If we want to retain the vanilla k8s functionality we would have to make this unit a systemd template unit and inject a true/false value for disabling/enabling tectonic.

I'm leaning towards solution 1, as this is effectively what we already do in tectonic.sh.
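For reference, option 1 could look like the following sketch. The `retry` helper and the `flaky` command are hypothetical; the real init-assets.sh would wrap its download steps in the helper instead:

```shell
#!/usr/bin/env bash
# Sketch of option 1: retry logic inside the script itself.
set -u

# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, up to
# ATTEMPTS times, sleeping DELAY seconds between tries.
retry() {
  local attempts=$1 delay=$2
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts" >&2
      return 1
    fi
    echo "attempt $n failed; retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Simulated flaky step: fails twice, then succeeds on the third try.
tries=0
flaky() {
  tries=$((tries + 1))
  [ "$tries" -ge 3 ]
}

retry 5 0 flaky && echo "succeeded after $tries tries"
```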

/cc @lucab with whom I have discussed subtle systemd semantics.


data "ignition_systemd_unit" "tectonic" {
name = "tectonic.service"
enable = "${var.tectonic_service_disabled == 0 ? true : false}"
@s-urbaniak

By removing both the tectonic and bootkube service we are disabling the functionality to install a vanilla k8s cluster.

@kans kans Jun 1, 2017

Yes, but why is that a feature we support in the tectonic installer?

@s-urbaniak

It was requested in #133 and this feature is currently used by the Prometheus team in their CI pipeline on AWS to test operator versions.

@kans

Thanks for the link.

@@ -125,14 +123,3 @@ data "ignition_systemd_unit" "init-assets" {
enable = "${var.assets_s3_location != "" ? true : false}"
content = "${file("${path.module}/resources/services/init-assets.service")}"
}

data "ignition_systemd_unit" "bootkube" {
name = "bootkube.service"
@s-urbaniak

Why is the bootkube service actually removed?


[Service]
Type=oneshot
RemainAfterExit=true
Restart=always
@s-urbaniak

This will always restart the service, even when it exits successfully, see https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=, effectively busy-looping after installation. The only thing preventing it from busy-looping is the ConditionPathExists, which seems like a fragile construct to me.

I strongly suggest setting Restart=on-failure. This will not busy-loop if the script exits with 0.
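A minimal sketch of that suggestion, reusing the marker file from this PR (the RestartSec value is illustrative):

```ini
[Unit]
# Skip the unit entirely once the marker file exists.
ConditionPathExists=!/opt/tectonic/init_assets.done

[Service]
Restart=on-failure
RestartSec=10
ExecStart=/usr/bin/bash /opt/tectonic/init-assets.sh
ExecStartPost=/bin/touch /opt/tectonic/init_assets.done
```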

@kans kans Jun 1, 2017

ok


[Service]
Type=oneshot
@s-urbaniak

Hmm, indeed Type=oneshot does not work with Restart, that's a bummer. There is a corresponding upstream issue: systemd/systemd#2582


ExecStart=/usr/bin/bash /opt/tectonic/init-assets.sh
ExecStartPost=/bin/touch /opt/tectonic/init_assets.done
TimeoutSec=infinity
@s-urbaniak

This raised my eyebrows, but it is also the effective setting when Type=oneshot.

@kans

The ExecStartPres would time out otherwise.


s-urbaniak commented Jun 1, 2017

To add more solutions to the brainstorming table:

  1. As shown in kube-aws ("decrypt-tls-assets.service failed in a controller node", coreos-kubernetes#675 (comment)), we could add ExecStartPre=/usr/bin/systemctl is-active init-assets.service to bootkube.service and ExecStartPre=/usr/bin/systemctl is-active bootkube.service to tectonic.service.
  2. We could add a path unit (https://www.freedesktop.org/software/systemd/man/systemd.path.html#PathExists=) with PathExists=, which will activate bootkube.service.
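Sketches of both options (unit names, paths, and the RestartSec value are illustrative):

```ini
# Option 1: bootkube.service polls its dependency via ExecStartPre.
# systemctl is-active exits non-zero while the dependency is not
# active, so with Restart=on-failure the unit retries until
# init-assets.service reports active.
[Service]
Restart=on-failure
RestartSec=10
ExecStartPre=/usr/bin/systemctl is-active init-assets.service
ExecStart=/usr/bin/bash /opt/tectonic/bootkube.sh

# Option 2: a bootkube.path unit that activates bootkube.service
# once the asset marker file appears.
[Path]
PathExists=/opt/tectonic/init_assets.done
Unit=bootkube.service
```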


kans commented Jun 1, 2017

First this PR unfortunately breaks "vanilla" k8s functionality

Is this feature explicitly supported? If so, I'd be happy to rework this PR.

...and [breaks] the separation of concerns between the three systemd service units init-assets.service, bootkube.service, and tectonic.service.

Init-assets, bootkube, and tectonic are tightly coupled in any case. They must run sequentially, may take an indefinite amount of time, and they should only run once, ever. Spreading the coupling across different services does not reduce the complexity of the system or make it easier to reason about.

All of these services have been marked as Type=oneshot consciously to have a clear happens-before relationship for the startup ordering between them.

Sure, but you don't get a cluster if any of the scripts exit for any reason.

All of this models the serial startup behavior in a pretty simple manner.

Mostly. I believe the ExecStartPost directives only run if everything exits 0, so the lockfiles won't be around if the units fail.

We could also overcome this by introducing a dedicated target, reached once the preceding service succeeds, but that introduces stage-based complexity bloat.

Agreed.

... This PR tries to delegate restart logic to systemd, and the outlined changes work around limitations in systemd. While I understand what we are trying to achieve here, I am not convinced that this is the right approach. [We should] implement the retry logic inside init-assets.sh itself and leave the current (simple) construct

I disagree, given the general impossibility of writing robust, mission-critical bash and the number of existing bugs in the current scripts (init-assets.sh, tectonic.sh, and bootkube.sh). In so many words: if bash were capable of doing this, I suspect systemd wouldn't exist.

@s-urbaniak

@kans thanks a lot for the good discussion! I totally agree with the point about bash robustness ;-) I was even thinking about implementing all of the tectonic.sh and init-assets logic in Go (bootkube is already invoked directly) for that reason.

You have a good point, though, about the complexity of orchestrating those three service unit files, which can be avoided. Once I am in the office I'll piggy-back on this PR to create a follow-up commit using a templated systemd service unit file.

@s-urbaniak

@kans so I iterated on idea 2 described above and tested a commit which builds on top of this PR, PTAL:

s-urbaniak@1f1d821

Also, to give qualified answers to your comments, I went through the individual points:

First this PR unfortunately breaks "vanilla" k8s functionality

Is this feature explicitly supported? If so, I'd be happy to rework this PR.

Please see #937 (comment)

...and [breaks] the separation of concerns between the three systemd service units init-assets.service, bootkube.service, and tectonic.service.

Init-assets, bootkube, and tectonic are tightly coupled in any case. They must run sequentially, may take an indefinite amount of time, and they should only run once, ever. Spreading the coupling across different services does not reduce the complexity of the system or make it easier to reason about.

You have a point, sure. This reasoning also strengthens my inclination to implement at least tectonic.sh and init-assets.sh as a Go program which runs atomically in a systemd service unit, similar to bootkube.

All of these services have been marked as Type=oneshot consciously to have a clear happens-before relationship for the startup ordering between them.

Sure, but you don't get a cluster if any of the scripts exit for any reason.

Agreed

All of this models the serial startup behavior in a pretty simple manner.

Mostly. I believe the ExecStartPost directives only run if everything exits 0, so the lockfiles won't be around if the units fail.

That is indeed correct.

... This PR tries to delegate restart logic to systemd, and the outlined changes work around limitations in systemd. While I understand what we are trying to achieve here, I am not convinced that this is the right approach. [We should] implement the retry logic inside init-assets.sh itself and leave the current (simple) construct

I disagree, given the general impossibility of writing robust, mission-critical bash and the number of existing bugs in the current scripts (init-assets.sh, tectonic.sh, and bootkube.sh). In so many words: if bash were capable of doing this, I suspect systemd wouldn't exist.

Totally agreed as described above.


sym3tri commented Jun 2, 2017

Note: this is to fix #924

alexsomesan previously approved these changes Jun 2, 2017
@alexsomesan

Approved by mistake. Was trying to read through the issue.
Please disregard.

@alexsomesan alexsomesan dismissed their stale review June 2, 2017 20:35

Mistakenly approved


kans commented Jun 2, 2017

@s-urbaniak @alexsomesan

Thanks for the links and backstory.

Simple services and systemctl is-active seem like the way to go. I'm going to close this PR and open a new one.

@kans kans closed this Jun 5, 2017
@sstarcher

I looked for other tickets, but it looks like this has not been resolved yet.

Another failure example in init-assets:

```
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: pubkey: prefix: "quay.io/coreos/awscli"
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: key: "https://quay.io/aci-signing-key"
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: gpg key fingerprint is: BFF3 13CD AA56 0B16 A898  7B8F 72AB F5F6 799D 33BC
Sep 15 21:44:21 ip-10-0-2-235 bash[758]:         Quay.io ACI Converter (ACI conversion signing key) <support@quay.io>
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: Trusting "https://quay.io/aci-signing-key" for prefix "quay.io/coreos/awscli" without fingerprint review.
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: Added key for prefix "quay.io/coreos/awscli" at "/etc/rkt/trustedkeys/prefix.d/quay.io/coreos/awscli/bff313cdaa560b16a8987b8f72abf5f679
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: Downloading signature:  0 B/473 B
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: Downloading signature:  473 B/473 B
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: Downloading signature:  473 B/473 B
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: Downloading ACI:  0 B/32 MB
Sep 15 21:44:21 ip-10-0-2-235 bash[758]: Downloading ACI:  16.3 KB/32 MB
Sep 15 21:44:22 ip-10-0-2-235 bash[758]: Downloading ACI:  32 MB/32 MB
Sep 15 21:44:22 ip-10-0-2-235 bash[758]: image: signature verified:
Sep 15 21:44:22 ip-10-0-2-235 bash[758]:   Quay.io ACI Converter (ACI conversion signing key) <support@quay.io>
Sep 15 21:44:29 ip-10-0-2-235 bash[758]: Error: Mounting "/etc/rkt-resolv.conf" on "/opt/stage2/awscli/rootfs//etc/resolv.conf" failed: No such file or directory
Sep 15 21:44:29 ip-10-0-2-235 bash[758]: cat: /run/metadata/master: No such file or directory
Sep 15 21:44:30 ip-10-0-2-235 systemd[1]: init-assets.service: Main process exited, code=exited, status=1/FAILURE
Sep 15 21:44:30 ip-10-0-2-235 systemd[1]: Failed to start Download Assets.
Sep 15 21:44:30 ip-10-0-2-235 systemd[1]: init-assets.service: Unit entered failed state.
Sep 15 21:44:30 ip-10-0-2-235 systemd[1]: init-assets.service: Failed with result 'exit-code'.
```
