gokrazy Update Service (GUS) #167

Open
20 of 30 tasks
stapelberg opened this issue Feb 5, 2023 · 7 comments
@stapelberg

stapelberg commented Feb 5, 2023

related to #25

@damdo and I have been working on a little project called gokrazy Update Service (GUS).

You can find the design document at https://docs.google.com/document/d/1q2TbB3DlO01_1EO7sQqFH4eMJE21XzniMs5hCqfqM8A/edit?usp=sharing

This issue tracks overall progress. Feel free to leave a comment or ask a question if you have any feedback!

  • gok sbom subcommand
  • github.com/gokrazy/gokrazy/cmd/heartbeat program
    • add a public method to fetch machine-id (and fallback to hostname)
  • github.com/gokrazy/gus/cmd/gus-server
    • SQLite database
    • Heartbeat API
    • UI
    • Push API
    • Ingest API
    • update desired image
    • GetUpdate API
    • AttemptUpdate API
    • garbage-collect old images: image needs to be unused, keep last 3 per machine id pattern
    • rename full.gaf to disk.gaf (default filename)
    • make machine id a separate column (next to hostname)
    • make current/desired version monospace to align the ids
    • Ingest: log remote IP, log hostname, log user-agent
    • implement machine id pattern matching support (currently only doing exact matches)
    • [later: delete individual images?]
  • gokrazy UI: currently shows build timestamp as version, should show sbom as version and build timestamp FYI
  • GAF archive format writing
    • gok overwrite
  • gok push
    • gok push to GUS local disk
  • gok gus
    • gok gus diff
    • gok gus set/unset
  • github.com/gokrazy/selfupdate client
    • later: inhibit mechanism to prevent updates at a bad time (HTTP API on localhost)
  • later: rate-limiting policy (update one at a time? pause if any update fails)

Plugin support is tracked in #173

stapelberg added a commit that referenced this issue Feb 5, 2023
stapelberg added a commit that referenced this issue Feb 5, 2023
If the --gus_server command line flag is configured,
this program will periodically check in with GUS (gokrazy Update Service).

related to #167
stapelberg added a commit to gokrazy/internal that referenced this issue Feb 5, 2023
heartbeat is a no-op if not configured.

related to gokrazy/gokrazy#167
stapelberg added a commit to gokrazy/tools that referenced this issue Feb 5, 2023
stapelberg added a commit to gokrazy/tools that referenced this issue Feb 5, 2023
stapelberg added a commit to gokrazy/gus that referenced this issue Feb 5, 2023
stapelberg added a commit to gokrazy/gus that referenced this issue Feb 5, 2023
stapelberg added a commit to gokrazy/gus that referenced this issue Feb 5, 2023
stapelberg added a commit to gokrazy/gus that referenced this issue Feb 5, 2023
@jshield

jshield commented Feb 9, 2023

A potential enhancement to this would be the ability to serve root/kernel/initrd/firmware over the network via nbd/nfs/styx/9p to do test boots for bakery validation. I'd imagine that, as part of early init, the device could ask the update server whether it should boot over the network.

@stapelberg

> A potential enhancement to this would be the ability to serve root/kernel/initrd/firmware over the network via nbd/nfs/styx/9p to do test boots for bakery validation. I'd imagine that, as part of early init, the device could ask the update server whether it should boot over the network.

That is certainly possible, and I have considered netbooting in the past.

The big issue however is that I really want the “boot from SD card” workflow to work 100% of the time, and the only way to achieve that is to exercise exactly this flow in our CI setup.

This might sound over-the-top, but I have seen behavioral differences on the Raspberry Pi depending on which boot mechanism was used: rebooting just hangs indefinitely when my Pi 4 is (was?) booted from a USB stick instead of an SD card. That’s a shame, because writing images to the USB stick is much faster, but if I get a different set of bugs, it doesn’t make sense to develop with that setup.

I have also diagnosed Linux kernel bugs that only happened when you run the Pi headless, without an HDMI monitor connected.

Hence, my current stance is that we should stick to a standard setup, one that is as close as possible to the recommended setup (boot from SD card, with/without a monitor connected).

@jshield

jshield commented Feb 9, 2023

No, that makes complete sense to me. SD boot should be the guaranteed boot mode and thus tested out of the box.

I'd still see utility in it as a primary boot mechanism for devices that don't have much local storage. I was considering trying something with some of the devices I have that currently run OpenWRT. I have an old wireless access point with a decent amount of RAM (128MB) but little flash (16MB). The smallest I could get the core services in gokrazy was about 9MB, by additionally stripping the binaries as part of the build, but storing it on the flash is impractical. I was, however, able to copy the squashed root to tmpfs, mount it, and pivot.

On the other hand, I could also potentially just write a service that would configure it over SSH and UCI, and deploy that onto one of the other gokrazy devices.

Also, as an aside: given you have the interfaces.json definition for gokrazy/rtr7, have you seen http://netjson.org/? It appears to be an attempt to standardize something similar, and it has features for fleet management that might be a useful addition to this as well.

stapelberg added a commit to gokrazy/tools that referenced this issue Feb 19, 2023
stapelberg added a commit to gokrazy/gus that referenced this issue Feb 19, 2023
@philhug

philhug commented Feb 21, 2023

Would it be possible to make the update service more generic, so that it can also return config files (or is this already part of the sbom?), and maybe allow other commands like reboot, exec, ...?
This would make it possible for user story 5 to completely disable the web interface on the gokrazy instance.

Also, I think the API calls could be reduced to one, by just returning the command payload (update_firmware, update_config, reboot, ...) in the heartbeat response and putting the attempt_update payload in the heartbeat message.

@stapelberg

> I'd still see utility in it as a primary boot mechanism for devices that don't have much local storage. I was considering trying something with some of the devices I have that currently run OpenWRT. I have an old wireless access point with a decent amount of RAM (128MB) but little flash (16MB). The smallest I could get the core services in gokrazy was about 9MB, by additionally stripping the binaries as part of the build, but storing it on the flash is impractical. I was, however, able to copy the squashed root to tmpfs, mount it, and pivot.

Impressive!

> On the other hand, I could also potentially just write a service that would configure it over SSH and UCI, and deploy that onto one of the other gokrazy devices.

Yep, that seems like a good architecture.

> Also, as an aside: given you have the interfaces.json definition for gokrazy/rtr7, have you seen http://netjson.org/? It appears to be an attempt to standardize something similar, and it has features for fleet management that might be a useful addition to this as well.

Thanks for the pointer!

@stapelberg

> Would it be possible to make the update service more generic, so that it can also return config files (or is this already part of the sbom?)

You can run gok sbom to see what’s part of the SBOM and what isn’t. For each file, the path and content hash go into the SBOM, but not the file contents. Conceptually, the SBOM is a bill of what went into the build, not instructions for how to build something.
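
To make the "path and content hash, but not file contents" idea concrete, here is a minimal sketch. It is not the format that gok sbom actually produces: the `FileEntry` type, the JSON layout, the choice of SHA-256, and the overall digest are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

// FileEntry is a hypothetical bill-of-materials entry: it records where a
// file lives and a hash of its content, but never the content itself.
type FileEntry struct {
	Path string `json:"path"`
	Hash string `json:"hash"`
}

// entries builds one FileEntry per file, sorted by path so that the result
// does not depend on map iteration order.
func entries(files map[string][]byte) []FileEntry {
	out := make([]FileEntry, 0, len(files))
	for path, content := range files {
		sum := sha256.Sum256(content)
		out = append(out, FileEntry{Path: path, Hash: hex.EncodeToString(sum[:])})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Path < out[j].Path })
	return out
}

// digest hashes the JSON encoding of the entries, so identical build inputs
// yield the same identifier and any changed file yields a different one.
func digest(es []FileEntry) (string, error) {
	b, err := json.Marshal(es)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	files := map[string][]byte{
		"config.json":     []byte(`{"Hostname":"demo"}`),
		"builddir/go.mod": []byte("module demo\n"),
	}
	es := entries(files)
	id, err := digest(es)
	if err != nil {
		panic(err)
	}
	fmt.Println("digest:", id)
	for _, e := range es {
		fmt.Println(e.Path, e.Hash)
	}
}
```

Because only hashes are recorded, such a document can tell you whether two builds used the same inputs, but it cannot be used to reconstruct those inputs.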

I think you’re talking about the gokrazy instance config. Distributing this file alone is not very useful — to build a gokrazy image, you currently need:

  1. the instance config
  2. the go.mod/go.sum files in builddir, and any locally modified working copies
  3. any referenced extrafiles

You’re welcome to store all of these files in a private git repository and distribute that in a way that’s convenient for you.

I don’t think this is a useful addition to GUS in the general case, though.

> and maybe allow other commands like reboot, exec, ...? This would make it possible for user story 5 to completely disable the web interface on the gokrazy instance.

You always need the web interface to do updates, even when said update is done through the selfupdate program.

If you want to interactively control your device, just install tailscale to make the web interface available. This doesn’t need to go through GUS.

> Also, I think the API calls could be reduced to one, by just returning the command payload (update_firmware, update_config, reboot, ...) in the heartbeat response and putting the attempt_update payload in the heartbeat message.

While that’s technically true, I don’t think it would be good design. I don’t see a reason to deviate from small, targeted requests/responses (see https://github.com/gokrazy/gokapi for an up-to-date description and docs browser).
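
To illustrate what "small, targeted requests/responses" could look like from the client side, here is a sketch that keeps Heartbeat, GetUpdate, and AttemptUpdate as separate calls. Only those three API names come from the task list in this issue. The Go types, their fields, the call order, and the in-memory `fakeGUS` are hypothetical and do not reflect the gokapi definitions.

```go
package main

import "fmt"

// Hypothetical request/response types, one small type per call.
type HeartbeatRequest struct{ MachineID, SBOMHash string }
type UpdateResponse struct{ SBOMHash, DownloadLink string }
type AttemptRequest struct{ MachineID, SBOMHash string }

// UpdateService lists the three calls as separate, narrowly scoped methods.
type UpdateService interface {
	Heartbeat(HeartbeatRequest) error
	GetUpdate(machineID string) (UpdateResponse, error)
	AttemptUpdate(AttemptRequest) error
}

// checkIn reports the current version, asks for the desired one, and, if
// they differ, records the attempt before calling install with the link.
// It returns true when an update was installed.
func checkIn(svc UpdateService, machineID, current string, install func(link string) error) (bool, error) {
	if err := svc.Heartbeat(HeartbeatRequest{MachineID: machineID, SBOMHash: current}); err != nil {
		return false, err
	}
	upd, err := svc.GetUpdate(machineID)
	if err != nil {
		return false, err
	}
	if upd.SBOMHash == "" || upd.SBOMHash == current {
		return false, nil // nothing to do
	}
	if err := svc.AttemptUpdate(AttemptRequest{MachineID: machineID, SBOMHash: upd.SBOMHash}); err != nil {
		return false, err
	}
	return true, install(upd.DownloadLink)
}

// fakeGUS is an in-memory stand-in used only to exercise checkIn.
type fakeGUS struct {
	desired    map[string]UpdateResponse
	heartbeats int
	attempts   []AttemptRequest
}

func (f *fakeGUS) Heartbeat(HeartbeatRequest) error { f.heartbeats++; return nil }

func (f *fakeGUS) GetUpdate(id string) (UpdateResponse, error) { return f.desired[id], nil }

func (f *fakeGUS) AttemptUpdate(r AttemptRequest) error {
	f.attempts = append(f.attempts, r)
	return nil
}

func main() {
	svc := &fakeGUS{desired: map[string]UpdateResponse{
		"m1": {SBOMHash: "new", DownloadLink: "/images/new"},
	}}
	updated, err := checkIn(svc, "m1", "old", func(link string) error {
		fmt.Println("installing from", link)
		return nil
	})
	fmt.Println(updated, err)
}
```

With separate calls, each request carries only the fields it needs, and the server can log and evolve each endpoint independently.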

@elgohr

elgohr commented Nov 26, 2023

Thank you for the effort!
Maybe you could have a look at https://www.nerves-hub.org/ for inspiration - if you didn't already 🙂
