Ephemeral invocations #166

Open
davidchisnall opened this issue Aug 3, 2021 · 18 comments
@davidchisnall

Is your feature request related to a problem? Please describe.
I want to run jobs in a throw-away jail that is reset to a previous state on exit. In most container systems, this is accomplished with an ephemeral layer over the top of a container image.

Describe potential alternatives or workaround you've considered (if any)

I currently wrap the pot invocation in a loop that rolls back to the previous snapshot each time. This isn't great, for three reasons:

  • I want to be able to upgrade the immutable base image periodically and I can't do that while the jail is running.
  • Rolling back is a synchronous operation and so I can't restart the jail until it's finished, whereas destroying a clone can happen concurrently with taking a new clone of the same base FS.
  • I have to be really careful to make sure I do the rollback in all possible failure modes of the script.
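For reference, the current workaround might look roughly like this (a sketch only; the pot name "runner" and the use of pot's revert subcommand are illustrative assumptions, not the actual script):

```shell
#!/bin/sh
# Sketch of the rollback loop described above; "runner" is a hypothetical pot name.
pot snapshot -p runner            # baseline snapshot, taken once up front
while true; do
    pot start runner              # run one throw-away job to completion
    pot revert -p runner          # roll back to the baseline (synchronous, so the
                                  # next iteration must wait for it to finish)
done
```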

Describe the feature you'd like to have

  • [ ] An ephemeral variant of pot start that clones the filesystem, runs from the clone, and destroys it at the end. If these clones live in a fixed part of the zpool namespace then pot can clean them up easily at the end.
  • [ ] A pot rename command so that I can atomically replace the immutable base image when I upgrade.
  • [ ] A mechanism to specify the zfs quota property for the ephemeral filesystem.
@grembo (Collaborator) commented Aug 3, 2021

While it isn't packaged up as a single feature, you can already do something like this, which should improve the situation:

# create base pot
pot create -p immutable -t single -b 13.0

# snapshot and clone
pot snapshot -p immutable
pot clone -P immutable -p mutable

# start derived pot
pot start mutable

# change stuff in immutable
echo "Changed some things" >/opt/pot/jails/immutable/m/blabla

# resnapshot and create new clone
pot snapshot -p immutable
pot clone -P immutable -p mutable_new

# stop old clone and move new clone into place
pot stop -p mutable
pot rename -p mutable -n mutable_old
pot rename -p mutable_new -n mutable
pot start mutable

# destroy old clone
pot destroy -p mutable_old

@davidchisnall (Author)

Thanks, that sounds like it's enough for what I need. I missed the clone and rename commands.

@davidchisnall (Author)

I've now done this. It would be nice to have an atomic destructive rename that pot protected from concurrent clones, but I can work around this by wrapping the rename in a script that I run while holding the same lock file that I hold while doing the clone.
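A sketch of that wrapper, using FreeBSD's lockf(1) to hold the same lock file around both the clone and the destructive rename (pot names and the lock path are hypothetical):

```shell
# Both the clone step and the rename step run under the same lock file,
# so the destructive rename can never interleave with a concurrent clone.
LOCK=/var/run/pot-immutable.lock   # hypothetical lock path

# Elsewhere, the clone is taken under the same lock:
#   lockf -k "$LOCK" pot clone -P immutable -p mutable_new

lockf -k "$LOCK" sh -c '
    pot stop -p mutable
    pot rename -p mutable -n mutable_old
    pot rename -p mutable_new -n mutable
'
```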

@davidchisnall (Author)

This actually doesn't do quite what I need, because the cloned invocation is linked to the original and so I can't replace the base one without stopping the running ones (which I don't want to do, I want them to gracefully exit).

A pot promote would do what I need, I believe. I can hack around this by doing the promote myself, but now I'm hard-coding pot's use of ZFS.
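Doing the promote by hand means reaching underneath pot, something like the following (the dataset path is an assumption about pot's ZFS layout, which is exactly the implementation detail being hard-coded):

```shell
# Promote the clone so it is no longer a child of the base dataset.
# zroot/pot/jails/mutable/m is a hypothetical dataset path.
zfs promote zroot/pot/jails/mutable/m

# The base is now free of dependents and can be destroyed without
# waiting for the running clone to stop.
pot destroy -p immutable
```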

davidchisnall reopened this Aug 5, 2021
@grembo (Collaborator) commented Aug 5, 2021

This actually doesn't do quite what I need, because the cloned invocation is linked to the original and so I can't replace the base one without stopping the running ones (which I don't want to do, I want them to gracefully exit).

Is this a way of saying "I want to be able to rename a running pot"?

@grembo (Collaborator) commented Aug 5, 2021

p.s. Can't you simply use clone + unique jail names (e.g., using uuids)? That's what the nomad plugin does when invoking "pot prepare".

@davidchisnall (Author)

Is this a way of saying "I want to be able to rename a running pot"?

That might work.

p.s. Can't you simply use clone + unique jail names (e.g., using uuids)? That's what the nomad plugin does when invoking "pot prepare".

I was using UUIDs, but now it's some extra metadata I need to communicate and I can't use well-known names of the pots to check if they're running, send them signals, and so on.

The UUID doesn't actually help here though. If I clone pot A to pot A-{UUID}, then I can't destroy pot A because the cloned dataset of A-{UUID} is dependent on A. I can fix that with an explicit zfs promote, but now I'm manipulating pot-owned ZFS datasets underneath pot, which doesn't sound like a good idea.

@grembo (Collaborator) commented Aug 5, 2021

But why would you want to destroy pot A? You can simply change/update/whatever in it and then do a new snapshot you can clone a new pot from (while keeping the old snapshot and running clones in place). Managing metadata is an extra burden for sure (but also not that hard). It’s all a bit theoretical without knowing more about what you’re actually trying to achieve.

@davidchisnall (Author)

But why would you want to destroy pot A?

Because it's no longer required. To make things more concrete:

  1. I create a pot containing a configured GitHub Actions runner and all of the dependencies for the tested code.
  2. I create an ephemeral clone of this runner
  3. It runs a single action, leaving it in a state where it's full of junk I want to throw away.
  4. I delete the ephemeral clone and loop from step 2.

At the same time, I create a new base pot containing updated versions of compilers and things, and an updated base system with security vulnerabilities fixed. I want this to be picked up by the ephemeral pot as soon as it finishes running one job (I also prod it to exit if it's in the long-poll state and not currently running anything).

As soon as the new base image is ready and the runner has finished, the base dataset is no longer required and should be deleted. If the ephemeral pot's dataset is promoted, this is trivial (ZFS handles the reference counting of any blocks that are still referenced by both).
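The loop in steps 1-4 could be sketched like this (pot names are hypothetical, and step 3's "wait for the job to finish" is elided to a comment):

```shell
# Step 1 (done once): a configured base pot named "actions-base".
while true; do
    name="runner-$(uuidgen)"
    pot clone -P actions-base -p "$name"   # step 2: ephemeral clone of the base
    pot start "$name"                      # step 3: runs a single action
    # ... wait for the runner inside the pot to exit ...
    pot destroy -p "$name" &               # step 4: destroy can proceed in the
done                                       #         background while we re-clone
```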

@grembo (Collaborator) commented Aug 5, 2021

I would simply run a prune script for that :), but maybe @pizzamig has more inspiration/ideas?

@pizzamig (Collaborator)

Hi everyone, sorry, I'm a bit late.

If I understood it correctly, we have:

  • a base pot, used as the base to create/run ephemeral pots
  • an ephemeral pot, used as the runner instance

Do you have one ephemeral per base or multiple ephemeral per base?

My observations:

  • you don't need to re-create the base pot from scratch every time; you can just run a local upgrade and take a new snapshot
  • the ephemeral pot is created by cloning a snapshot. When a new base pot is available, you can simply create a new ephemeral pot by cloning the new snapshot of the base pot
  • if you only have one instance of the ephemeral clone, you can take a snapshot before running it for the first time and roll back, instead of destroying the pot

pot snapshot uses the UNIX epoch as the snapshot name.
pot purge-snapshots can help to remove old snapshots.
I can add a feature to clone (pot clone -s latest) to use the oldest/latest available snapshot, to avoid external management of snapshot tags.
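If that feature were added, the external snapshot-tag bookkeeping would collapse to something like this (a sketch: the -s latest flag is only proposed above and does not exist yet, and the purge-snapshots invocation is assumed):

```shell
pot snapshot -p immutable                    # snapshot named with the UNIX epoch
pot clone -P immutable -p mutable -s latest  # proposed flag: always clone the
                                             # newest available snapshot
pot purge-snapshots -p immutable             # drop older, now-unused snapshots
```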

@davidchisnall (Author)

Do you have one ephemeral per base or multiple ephemeral per base?

I have a single ephemeral one; other use cases would want multiple ones.

you don't need to re-create the base pot from scratch all the times, you can just run local upgrade and take a new snapshot

That's definitely what I'd have done 10-20 years ago, but it's not recommended practice for modern operations. Container deployments are supposed to be deterministically created from a declarative recipe, not continually evolving.

the ephemeral pot is created cloning a snapshot. When a new base pot is available, you can simply create a new ephemeral pot cloning the new snapshot of the base pot

Yup, that's what I'm doing now, but I need to run a zfs promote on the underlying dataset, which means I need to rely on implementation details of pot.

if you only have one instance of the ephemeral clone, you can take a snapshot before running it for the first time and roll back, instead of destroying the pot

That's what I was doing but rollback is a synchronous operation whereas destroying a clone can happen in the background.

@pizzamig (Collaborator)

I don't understand the need to run a zfs promote (I guess you want to promote the origin to the new snapshot).
Why is re-cloning the ephemeral pot from the new snapshot not enough? What am I missing?

@pizzamig (Collaborator)

I've just installed and successfully started a runner using your scripts.
Now I understand your use case for upgrades:

  • deterministically create a new base (with suffix -tmp)
  • zfs promote the ephemeral pot to remove its dependency on the old base snapshot
  • destroy the old base
  • rename the new base to drop the -tmp suffix
  • gracefully shut down the ephemeral pot; the run-actions-runner will automatically recreate it

In other words, you would need a way to recreate the base (with the same name), without shutting down the ephemeral pot.

The zfs promote solution, however, could get complicated with multiple ephemeral pots (you would need to destroy the promoted ephemeral pot last).
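That promote-based sequence, spelled out as commands (pot names are hypothetical and the zfs line assumes pot's default dataset layout):

```shell
pot create -p base-tmp -t single -b 13.0   # deterministically create the new base
zfs promote zroot/pot/jails/runner/m       # detach the ephemeral pot from the
                                           # old base snapshot (hypothetical path)
pot destroy -p base                        # old base is no longer pinned
pot rename -p base-tmp -n base             # new base takes the well-known name
pot stop -p runner                         # graceful shutdown; the runner service
                                           # recreates the pot from the new base
```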

@pizzamig (Collaborator)

It seems that you can rename the base pot while the ephemeral pot is running (the zfs origin is updated accordingly).

So the upgrade process could be:

  • rename the base with suffix -old
  • deterministically create a new base (no suffix)
  • gracefully shut down the ephemeral pot (and let run-actions-runner recreate it using the new base)
  • destroy the -old base

I will test the entire process later this week (maybe submitting a PR to your project), but pot rename -p base -n base-old seemed to work.
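The rename-based process above, as a sketch (pot names "base" and "runner" are hypothetical):

```shell
pot rename -p base -n base-old            # works while the ephemeral pot is
                                          # running; its zfs origin follows
pot create -p base -t single -b 13.0      # deterministically recreate the base
pot stop -p runner                        # graceful shutdown; run-actions-runner
                                          # recreates it from the new base
pot destroy -p base-old                   # old base and its snapshots go away
```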

@davidchisnall (Author)

Thanks. The zfs promote step prevents the cloned dataset from being marked as a child of the original, which allows the original to be deleted without needing to synchronise with the running invocation. I'd rather avoid any serialisation here: CI jobs can run for up to 6 hours under the standard GitHub policy, and having the rename operation block for 6 hours would not be great.

@grembo (Collaborator) commented Dec 15, 2022

Hi @davidchisnall, do you think it would make sense to revisit this requirement? (We made quite some structural progress this year, so we might be in a better position to implement the feature now.)

@davidchisnall (Author)

Now that there's support for OCI containers on FreeBSD, I plan on moving my things over to that, so feel free to close this if no one else needs it.
