Upgrading Fedora CoreOS hosts #87

Open
masterzen opened this issue Jun 16, 2020 · 4 comments

masterzen commented Jun 16, 2020

This is more a prospective ticket than an actual feature request, intended to gather insight about using SUC in this context.

There's currently a lack of options for upgrading a fleet of Fedora CoreOS based Kubernetes nodes, whereas we had coreos/container-linux-update-operator for CoreOS (Container Linux).

FCOS uses the coreos/zincati tool to check with a Cincinnati server for new upgrades and to download/apply them. The process can be coordinated through another protocol called FleetLock, which ensures only one host reboots at a time. The problem is that this solution is not Kubernetes-aware, unlike SUC. That's why, when we migrated to FCOS, we based our auto-upgrade system on SUC.

I've used it successfully to upgrade FCOS hosts with an upgrade script that calls the rpm-ostree tool (the underlying upgrade command). Unfortunately, this doesn't work completely (see coreos/fedora-coreos-tracker#536 for a discussion of the issue).
Notably, the last step of my naive upgrade script executes a reboot, which kills the container. The job is consequently marked as failed, and when the node comes back up, the container is rescheduled (hopefully not doing anything). This increases the time it takes to roll out an upgrade. I have yet to find a solution to this, because there's a race between making sure the machine reboots (to apply the update) and signalling to SUC that the update was performed correctly.
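For illustration, a minimal sketch of what such a naive upgrade script can look like (hypothetical, not my exact script; it assumes the plan container gets the host root mounted at /host, as is conventional for SUC plan images):

```sh
#!/bin/sh
# Hypothetical sketch of a naive FCOS upgrade script run by a SUC plan job.
set -eu

# Stage the new deployment without rebooting.
chroot /host rpm-ostree upgrade

# Problematic last step: rebooting here kills this container before it can
# exit 0, so the upgrade job is marked as failed and gets rescheduled once
# the node comes back up.
chroot /host systemctl reboot
```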

The other issue is that I have to manually maintain the version number to upgrade to in the Plan. So for instance, when there's a new FCOS version, I manually update the spec.version field of the Plan to trigger the upgrade.
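For context, a sketch of the kind of Plan this involves (names, labels, and the upgrade image below are illustrative, not taken from my setup); the point is that spec.version is a hand-maintained literal:

```sh
kubectl apply -f - <<'EOF'
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: fcos-upgrade
  namespace: system-upgrade
spec:
  concurrency: 1                     # one node at a time
  version: "32.20200601.3.0"         # example value; must be bumped by hand for each FCOS release
  serviceAccountName: system-upgrade
  nodeSelector:
    matchExpressions:
      - {key: fcos-upgrade.example.com/enabled, operator: In, values: ["true"]}
  drain:
    force: true
  upgrade:
    image: registry.example.com/fcos-rpm-ostree-upgrade  # image wrapping a script like the one above
EOF
```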

My initial plan was to develop a service that would implement the SUC channel system on one side and the Cincinnati protocol on the other, so that the Plan would be triggered whenever the Cincinnati server reports a new version.
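As a rough illustration of the idea (not an implementation), such a bridge essentially boils down to something like the following, assuming the public FCOS update-graph endpoint and a Plan named fcos-upgrade; a real Cincinnati client would walk the graph edges from the node's current version rather than simply taking the highest version:

```sh
#!/bin/sh
# Hypothetical sketch: poll the FCOS Cincinnati (updates) server and push the
# newest version into the Plan so SUC triggers the rollout.
set -eu

STREAM=stable
ARCH=x86_64

# Fetch the update graph; every node in it carries a "version" field.
latest=$(curl -fsSL -H 'Accept: application/json' \
  "https://updates.coreos.fedoraproject.org/v1/graph?basearch=${ARCH}&stream=${STREAM}" \
  | jq -r '.nodes[].version' | sort -V | tail -n1)

# Update spec.version; SUC notices the change and schedules the upgrade jobs.
kubectl -n system-upgrade patch plans.upgrade.cattle.io fcos-upgrade \
  --type=merge -p "{\"spec\":{\"version\":\"${latest}\"}}"
```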

In retrospect, I'm wondering if this shouldn't be part of SUC itself, instead of living in another service. Would you accept a PR to implement a configurable channel system, where one of the implementations would be the Cincinnati protocol?

In short, besides my questions above, I'm wondering how we can better connect the FCOS upgrade tools (Zincati, etc.) to SUC to build a powerful k8s-based FCOS upgrade system.

Thanks,

/cc @lucab


bitfisher commented Nov 16, 2023

@masterzen I'm looking for a similar solution.

Initially I thought FleetLock was the way to go, but unfortunately it lacks support for maintenance windows.

Zincati also doesn't support a combination of the fleet_lock and periodic strategies.
There is an open feature request for this in coreos/zincati#1014.

Another possible solution that came to my mind is using systemd timers together with zincati and fleet_lock: one timer (maintenance window start) starts zincati and another timer (maintenance window end) stops it, roughly as sketched below.
But this approach seems more than ugly.
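For the record, a rough sketch of that idea with transient units (the 02:00–04:00 window is just an example; transient units don't survive a reboot, so a real FCOS setup would ship proper unit files via Ignition/Butane instead):

```sh
# Hypothetical sketch: open/close a nightly maintenance window for zincati.
systemd-run --unit=zincati-window-open  --on-calendar='*-*-* 02:00:00' \
  systemctl start zincati.service
systemd-run --unit=zincati-window-close --on-calendar='*-*-* 04:00:00' \
  systemctl stop zincati.service
```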

Did you manage to have a proper solution using system-upgrade-controller?

Would you mind sharing your plan and actual update script?

@brandond
Contributor

Sounds like some overlap with #63?

@bitfisher

@brandond yes, you are right!
Thanks for the pointer to kured ;) it wasn't on my radar yet.

@craigcabrey

For what it's worth, I maintain a fork of fleetlock that adds a simple maintenance window: https://github.com/craigcabrey/fleetlock
