Upgrading Fedora CoreOS hosts #87
Comments
@masterzen I'm looking for a similar solution. Initially I thought
Another possible solution which came to my mind is using Did you manage to have a proper solution using Would you mind sharing your
Sounds like some overlap with #63?
@brandond yes, you are right!
For what it's worth, I maintain a fork of fleetlock that adds a simple maintenance window: https://github.com/craigcabrey/fleetlock
This is more of a prospective ticket than an actual feature request; its goal is to gather insight about using SUC in this context.
There is currently no good option for upgrading a fleet of Fedora CoreOS (FCOS) based Kubernetes clusters, whereas we had coreos/container-linux-update-operator for CoreOS.
FCOS uses the coreos/zincati tool to check a Cincinnati server for new upgrades and to download and apply them. The process can be controlled by another protocol called FleetLock, which can ensure that only one host reboots at a time. The problem is that this solution is not Kubernetes-aware, unlike SUC. That's why, when we migrated to FCOS, we based our auto-upgrade system on SUC.
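For readers unfamiliar with FleetLock, here is a minimal sketch of the request a node sends before rebooting, as I understand the protocol: a `POST` to `/v1/pre-reboot` (or `/v1/steady-state` once the node is back) with a `fleet-lock-protocol: true` header and a JSON body identifying the node. The helper name `fleetlock_request`, the `fleetlock.example` host and the `node-a` identifier are made up for illustration, and the request is only built, not sent.

```python
import json
import urllib.request

# The two operations a FleetLock client can ask for.
OPERATIONS = ("pre-reboot", "steady-state")


def fleetlock_request(base_url, operation, node_id, group="default"):
    """Build (but do not send) a FleetLock request for one node.

    `base_url`, `node_id` and `group` are caller-supplied; the values used
    in the usage example below are hypothetical.
    """
    if operation not in OPERATIONS:
        raise ValueError(f"unknown FleetLock operation: {operation}")
    body = json.dumps({"client_params": {"id": node_id, "group": group}})
    return urllib.request.Request(
        url=f"{base_url}/v1/{operation}",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            # Header required by the protocol.
            "fleet-lock-protocol": "true",
            # Not part of the protocol as far as I know; added for clarity.
            "Content-Type": "application/json",
        },
    )


# Usage: ask for the reboot lock, then release it after the reboot.
lock = fleetlock_request("http://fleetlock.example", "pre-reboot", "node-a")
unlock = fleetlock_request("http://fleetlock.example", "steady-state", "node-a")
```

Nothing in this exchange mentions pods, drains or node readiness, which is the "not Kubernetes-aware" part.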
I've used it successfully to upgrade FCOS hosts by using an upgrade script that calls the `rpm-ostree` tool (the underlying upgrade command). Unfortunately, this doesn't work completely (see coreos/fedora-coreos-tracker#536 for a discussion of the issue).
Notably, the last step of my naive upgrade script automatically executes a reboot, which kills the container. The job is consequently marked as failed, and when the node comes back up, the container is rescheduled (hopefully without doing anything). This increases the time it takes to roll out an upgrade. I have yet to find a solution, because there's a race between making sure the machine reboots (to apply the update) and signalling to SUC that the update has been performed correctly.
The other issue is that I have to manually maintain the version number to upgrade to in the Plan. For instance, if there's a new FCOS version, I manually update the `spec.version` Plan field to trigger the upgrade.
My initial plan was to develop a service that would implement the SUC `channel` system on one side and the Cincinnati protocol on the other, so that the Plan would be triggered when the Cincinnati server reports a new version.
In retrospect, I'm wondering whether this should be part of SUC itself instead of a separate service. Would you accept a PR to implement a configurable channel system, where one of the implementations would be the Cincinnati protocol?
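To make the idea concrete, here is a rough sketch of the core of such a bridge, not an actual SUC or Zincati implementation. It assumes a Cincinnati graph document shaped as `{"nodes": [{"version": ...}], "edges": [[from, to], ...]}`, picks the newest version reachable from the currently deployed one, and renders a merge patch for the Plan's `spec.version`. The function names and the sample versions are invented for the example.

```python
import json


def parse_version(version):
    # Assumes dot-separated numeric versions such as "32.20200601.3.0".
    return tuple(int(part) for part in version.split("."))


def next_version(graph, current):
    """Return the newest version with an edge from `current`, or None."""
    nodes = graph["nodes"]
    index = next(
        (i for i, node in enumerate(nodes) if node["version"] == current), None
    )
    if index is None:
        return None  # the current version is unknown to the update server
    targets = [nodes[dst]["version"] for src, dst in graph["edges"] if src == index]
    return max(targets, key=parse_version) if targets else None


def plan_patch(version):
    """Render a JSON merge patch that sets the Plan's spec.version."""
    return json.dumps({"spec": {"version": version}})


# Sample graph with made-up data: node 0 can update to node 1 or node 2.
sample_graph = {
    "nodes": [
        {"version": "32.20200601.3.0"},
        {"version": "32.20200615.3.0"},
        {"version": "32.20200629.3.0"},
    ],
    "edges": [[0, 1], [0, 2], [1, 2]],
}

target = next_version(sample_graph, "32.20200601.3.0")
if target is not None:
    print(plan_patch(target))
```

Whether this logic lives in a separate service or behind a pluggable channel inside SUC is exactly the open question; the sketch only shows that the mapping from an update graph to a Plan version is small.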
In short, besides my questions above, I'm wondering how we can better connect the FCOS upgrade tools (Zincati, etc.) to SUC to build a powerful k8s-based FCOS upgrade system.
Thanks,
/cc @lucab