Add a bootstrap stage to allow a two stage node boot process #1115
Comments
This will be very useful. I don't think it has to be complicated with bootstrap overlays, though. All we need is a minimal OS that boots first and can then properly load the big OS container: set up disks, grab secrets, etc. As far as I remember, this is exactly how it is done in Bright Cluster Manager, and it worked perfectly well. No issues with old BIOS/PXE/whatever, and an easy stateless setup, but with disks: every time a node boots, the disks are wiped and the OS is simply rsynced onto them.
Treating the bootstrap exactly the way a normal container is treated keeps things consistent. Allowing the bootstrap to have overlays brings that same flexibility to the bootstrap phase and is effectively already written in WW4. Rather than having a single hard-coded bootstrap, a site might want to build their own with
A follow-up to this RFE would definitely be that WW should include a standard, documented bootstrap container that gives new users a working default. But given that it's just as simple to allow local sites complete freedom to build their own bootstraps and processes, it'd be a shame to restrict them to a single "approved" bootstrap image/process. Also, I want WW to be BETTER than Bright, not LIKE Bright 😁
I'm a supporter of this idea. If you came from WW3 (or even WW2/Perceus), you know that two-stage provisioning has been the model for years. I believe we had some discussion on this during the early days of the WW4 project, and @gmkurtzer decided to go with the current model. From what I recall, the original argument was: if you are already provisioning a container to the node, why not provision it directly instead of relying on another stager/bootstrap to provision it for you? But I do see the value of being flexible. Technically this is not too difficult to do with the current model either: you can build a first-stage container that works just like the old bootstrap and performs the necessary tasks, such as preparing disks as @msquantori would like, then wgets the second-stage container from the same warewulfd, extracts it, and switch_roots to it.
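To make the download, extract, and switch_root sequence concrete, here is a minimal sketch that only builds the list of commands a first-stage init might run; it does not execute them. Everything specific in it is an assumption for illustration: the image URL is a placeholder passed in by the caller (not a documented warewulfd endpoint), the image is assumed to be a gzip-compressed cpio archive, and the `/newroot` mount point and `/sbin/init` path are hypothetical defaults. Disk preparation is site-specific and left out.

```python
import shlex
from typing import List


def second_stage_plan(
    image_url: str,
    newroot: str = "/newroot",   # hypothetical mount point for the final root
    init: str = "/sbin/init",    # assumed init path inside the second stage
) -> List[List[str]]:
    """Return, in order, the commands a first-stage init might run.

    Assumes the second-stage image at image_url is a gzip-compressed cpio
    archive. Nothing is executed here; the caller decides how to run it.
    """
    fetch_and_unpack = (
        f"wget -O - {shlex.quote(image_url)} | gzip -dc | "
        f"(cd {shlex.quote(newroot)} && cpio -idm)"
    )
    return [
        ["mkdir", "-p", newroot],
        # switch_root needs the new root to be a mount point; tmpfs keeps
        # the node stateless (the "load into RAM" case).
        ["mount", "-t", "tmpfs", "tmpfs", newroot],
        # Stream the image straight into the new root without a temp file.
        ["sh", "-c", fetch_and_unpack],
        # Must be exec'd by PID 1 of the first stage; it does not return.
        ["switch_root", newroot, init],
    ]


if __name__ == "__main__":
    # Placeholder URL, not a real warewulfd path.
    for cmd in second_stage_plan("http://wwserver.example/second-stage.img.gz"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Returning a plan rather than running it keeps the sketch testable on any machine; a real first stage would more likely be a small shell script in the bootstrap container doing the same steps.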
Agree. I honestly think that, as a first step, even simple documentation of how to do it with the current setup would be extremely helpful. I'm pretty sure it is possible to automate it with node tags, where we specify the "true" container and some other parameters, though built-in functionality would be more convenient.
Basically, there are two cases that are interesting for me:
1) Boot bootstrap container -> load the larger production container into RAM (this will solve all the issues with PXE).
2) Boot bootstrap container -> sync the production container onto HDDs (this will solve issues when your container is big for whatever reason, like VDI nodes, some specific software installations, etc.).
We still stick with the stateless concept but have more provisioning options. Right now, not being able to use large containers without workarounds is a big limitation.
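As a rough illustration of the node-tag idea, the sketch below reads the "true" container and the provisioning target (RAM or disk, matching the two cases above) from a plain dictionary of tags. The tag names `stage2_container` and `stage2_target` are invented for this example and are not existing Warewulf conventions; how the tags reach the first stage is also left open.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SecondStage:
    container: str  # name of the production ("true") container
    target: str     # "ram" (case 1) or "disk" (case 2)


def second_stage_from_tags(tags: Dict[str, str]) -> SecondStage:
    """Derive the second-stage settings from node tags.

    Tag names are hypothetical; "ram" is assumed as the default target.
    """
    container = tags.get("stage2_container")
    if not container:
        raise ValueError("node tag 'stage2_container' is required")
    target = tags.get("stage2_target", "ram")
    if target not in ("ram", "disk"):
        raise ValueError(f"unsupported stage2_target: {target!r}")
    return SecondStage(container=container, target=target)


if __name__ == "__main__":
    print(second_stage_from_tags({"stage2_container": "rocky9-vdi",
                                  "stage2_target": "disk"}))
```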
My use case for two-stage provisioning is different from yours, so I'll just throw it out for some thoughts as well. We have some hardware (a NIC in this case) that lacks proper iPXE support: after some fiddling with iPXE itself, I was only able to get it to negotiate 100 Mbps instead of the expected 1 Gbps (I think there are similar reports in the iPXE community as well). This makes downloading the full-blown container image take a long time. The OS driver does not have that limit, however, so the use case would be to use iPXE to download a first-stage container that is as small as possible, then load the OS driver and download the full-blown second-stage container. Yes, I agree that native support would be ideal; I'm just saying that this isn't a showstopper with the current model.
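A back-of-the-envelope calculation shows why the split helps in this situation. The image sizes below (a 2 GiB full container and a 50 MiB first stage) are made-up numbers for illustration, and the estimate ignores protocol overhead, compression, and server-side limits.

```python
def transfer_seconds(size_bytes: int, link_mbps: float) -> float:
    """Ideal transfer time for size_bytes over a link of link_mbps (10^6 bit/s)."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


MIB = 1024 ** 2
GIB = 1024 ** 3

# Hypothetical sizes, chosen only to illustrate the trade-off.
FULL_IMAGE = 2 * GIB
FIRST_STAGE = 50 * MIB

# Single stage: the whole image goes over the 100 Mbps link negotiated by iPXE.
single_stage = transfer_seconds(FULL_IMAGE, 100)

# Two stages: small first stage at 100 Mbps, then the full image at 1 Gbps
# once the OS driver has brought the link up properly.
two_stage = transfer_seconds(FIRST_STAGE, 100) + transfer_seconds(FULL_IMAGE, 1000)

if __name__ == "__main__":
    print(f"single stage: {single_stage:.1f} s")
    print(f"two stages:   {two_stage:.1f} s")
```

With these assumed sizes, the second download dominates the two-stage total, so keeping the first stage small matters less than getting the full image onto the faster link.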
Summary
Request is for warewulfd to serve a
- BootStrapContainer in the node config
- BootStrapKernel in the node config
- BootStrapSystemOverlay
- BootStrapRuntimeOverlay
This would allow optionally booting nodes with an arbitrary kernel and container, which could then perform whatever actions are needed before downloading the final kernel/container/overlays and pivoting/switch-rooting into them and/or doing a kexec().
Rationale
This enhancement would allow easily testing using a shim bootstrap, whether that is a dracut initrd, custom bootstrap image or just a minimal OS container. Potential applications/benefits are:
- grub and any other steps required to statefully provision nodes
- tmpfs and depending on swapping to free memory
Description
- BootStrap [Container|Kernel|SystemOverlay|RuntimeOverlay] in the node config
- stage_* requests to warewulfd
The BootStrap containers, kernels and overlays can be managed just like any other containers, kernels and overlays.
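One way to picture the request is a node record that carries optional bootstrap counterparts for each asset, with the server choosing between them by stage. The sketch below is purely illustrative: the class, field names, and stage labels are hypothetical and do not reflect Warewulf's actual node configuration schema or warewulfd's request handling. It also assumes that an unset bootstrap field falls back to the regular asset, which is one possible reading of "optionally".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeConf:
    # Regular (final) assets.
    container: str
    kernel: str
    system_overlay: str
    runtime_overlay: str
    # Optional bootstrap counterparts (hypothetical field names).
    bootstrap_container: Optional[str] = None
    bootstrap_kernel: Optional[str] = None
    bootstrap_system_overlay: Optional[str] = None
    bootstrap_runtime_overlay: Optional[str] = None


def assets_for_stage(node: NodeConf, stage: str) -> Dict[str, str]:
    """Pick the assets to serve for stage "bootstrap" or "final".

    In the bootstrap stage, any unset bootstrap field falls back to the
    regular asset, so nodes without bootstrap settings boot as before.
    """
    final = {
        "container": node.container,
        "kernel": node.kernel,
        "system_overlay": node.system_overlay,
        "runtime_overlay": node.runtime_overlay,
    }
    if stage == "final":
        return final
    if stage != "bootstrap":
        raise ValueError(f"unknown stage: {stage!r}")
    overrides = {
        "container": node.bootstrap_container,
        "kernel": node.bootstrap_kernel,
        "system_overlay": node.bootstrap_system_overlay,
        "runtime_overlay": node.bootstrap_runtime_overlay,
    }
    return {key: overrides[key] or final[key] for key in final}


if __name__ == "__main__":
    node = NodeConf("rocky9", "5.14", "wwinit", "generic",
                    bootstrap_container="shim")
    print(assets_for_stage(node, "bootstrap"))
    print(assets_for_stage(node, "final"))
```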
Additional information
Not sure how this would look for ipxe or grub or any other boot methods, so leaving those as an exercise for the admin after the bootstrap functionality is available.
General information