-
Notifications
You must be signed in to change notification settings - Fork 561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
o/hookstate/ctlcmd: queue service commands if run from default-configure hook #13960
base: master
Are you sure you want to change the base?
o/hookstate/ctlcmd: queue service commands if run from default-configure hook #13960
Conversation
…re hook Queue "snapctl restart ..." and "snapctl start ..." commands to be run after default-configure similar to configure hook. This is to avoid a problem where the service doesn't see a new value if it uses "snapctl get ...", because it's still not commited by default-configure hook. Fixes: https://bugs.launchpad.net/snapd/+bug/2047949 Signed-off-by: Zeyad Gouda <zeyad.gouda@canonical.com>
153e4fd
to
757129b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Please make sure to link updated documentation on the snapcraft forum.
This fix just makes the implementation consistent with the documented behavior https://snapcraft.io/docs/supported-snap-hooks#heading--default-configure and the snapctl docs don't mention this behavior at all. Do you have a specific doc page in mind that would need updating? |
It's not clear why the user still wants to start services from within the default configure hook. My understanding is the the reason for this was specific to the order of configure-hook VS starting services as part of the do-install flow which should not be required anymore unless I am missing something. A secondary more subtle point, it does not seem like a good idea to start/interfere with service commands before do-install start service is done. Longer term we might want explicitly limit snapctl powers depending on how far the install sequence progressed to avoid creative permutations that cause problems. |
The reason why the user have service disabled still: They use the snap for core and for classic, and in classic there is no concept of seeding configuration with gadget. They rely on the user manually modifying the configuration and starting the service, so it is required to have it disabled by default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some question/comments
@@ -149,7 +149,7 @@ func runServiceCommand(context *hookstate.Context, inst *servicestate.Instructio | |||
return err | |||
} | |||
|
|||
if !context.IsEphemeral() && context.HookName() == "configure" { | |||
if !context.IsEphemeral() && (context.HookName() == "configure" || context.HookName() == "default-configure") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do the tasks queued after default-configure exactly end up? it's not very transparent from tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are queued at the end after all install tasks
snapd/overlord/hookstate/ctlcmd/helpers.go
Lines 114 to 124 in 0bf3e39
for _, ts := range tts { | |
for _, t := range tasks { | |
// queue service command after all tasks, except for final tasks which must come after service commands | |
if finalTasks[t.Kind()] { | |
t.WaitAll(ts) | |
} else { | |
ts.WaitFor(t) | |
} | |
} | |
change.AddAll(ts) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added clarifying comments in the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this mean after the configure hook? does this solve the original issue people were having in ways they expect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, after both the configure hook and the health-check hook.
For illustration this is the expected order of tasks for installing a snap which is used in the unit tests:
"prerequisites",
"download-snap",
"validate-snap",
"mount-snap",
"copy-snap-data",
"setup-profiles",
"link-snap",
"auto-connect",
"set-auto-aliases",
"setup-aliases",
"run-hook[install]",
"run-hook[default-configure]",
"start-snap-services",
"run-hook[configure]",
"run-hook[check-health]",
"exec-command", <---- task for snapctl command
"service-control", <---- task for service operation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, we probably want to ask the reporter on the bugs if this is too late for them or ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this solve the original issue people were having in ways they expect?
Yes, I tested it locally using the snap they provided. I left a comment there but I am waiting for feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, we probably want to ask the reporter on the bugs if this is too late for them or ok?
Good point, I will check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ZeyadYasser for the fix and your explanations.
The use case is from a snap that has a one-shot service doing several firewall and network configuration to set up a Thread bridge (linked in the bug report). As mentioned in #13960 (comment), the services are disabled by default, so that the user (on classic but also a pre-built Core image) gets a chance to configure for e.g. the networking interface before starting the services. However, the same snap needs to start automatically on Core when the config is seeded from a gadget. (A workaround was suggested by @zyga to keep services enabled by default but not run them when the needed config isn't set; this could work well for a limited config set.)
In our case, with the workflow proposed in this PR, it could work as follows:
default-configure
hook callssnapctl start
if theautostart
config option is set to true.configure
hook does some validation (not done indefault-configure
to avoid redundant checks).- services start.
But I suppose this isn't valid in every use case, as some applications expect the configure
hook to execute while the services are running.
In summary, while the proposed order isn't an issue for our use case, I believe starting the services right after the hook succeeds makes this more inline with the snap best practices.
Side note: I wonder why run-hook[check-health]
is ordered between configure
and exec-command
/service-control
tasks.
Another solution I can think of is to add the ability to just enable/disable services via snapctl
without starting or stopping them. Then, the default-configure
hook could simply call for e.g. the snapctl enable --service <service>
(snapctl enable
does something else) command (without needing to queue) and the following task (start-snap-services
) should just take care of the startup. I've already discussed this with @ernestl, and I'm not sure if this is going to work. I have experimented by calling snapctl stop --disable <service>
from install
and default-configure
hooks for an enabled service, and it works as expected: the service gets disabled and never starts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the earliest we can start those services is after snap-start-services, @ZeyadYasser could you look what would be the complexity of queuing commands from default-configure there?
And add comments explaining tasks relative order. Signed-off-by: Zeyad Gouda <zeyad.gouda@canonical.com>
Signed-off-by: Zeyad Gouda <zeyad.gouda@canonical.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine with me assuming it solves the original problems as is
This PR queues
snapctl
services' commands to be run after thedefault-configure
similar to what is done for configure hook #4070. This is to avoid a problem where the service doesn't see a new value if it usessnapctl get ...
, because it's still not committed bydefault-configure
hook similar to what was described in LP#2047949 .Fixes: https://bugs.launchpad.net/snapd/+bug/2047949