Cannot set sysctl's in a host network namespace without --privileged #43769

Closed
kaysond opened this issue Jul 5, 2022 · 9 comments
Labels
area/networking area/swarm kind/feature

Comments

@kaysond

kaysond commented Jul 5, 2022

Description

docker-ingress-routing-daemon is a solution to the issue of swarm obscuring packet source IPs (#25526). We've been discussing the option of deploying it via docker swarm (see here), and one of the things that has come up is the inability to set sysctls on the ingress_sbox network namespace from inside the swarm service, because swarm services do not yet support --privileged (#25303).

As far as I can tell, with the right cap_add's, it should theoretically be possible to set the sysctls, but this is thwarted by the fact that docker mounts /proc/sys as read-only inside containers. Unmounting it to expose the parent /proc mount, which is rw, doesn't help, because then you get a permission denied error.
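To confirm the read-only mount described above from inside a container, a minimal Python sketch along these lines could be used. It parses a mount table in the /proc/mounts format and reports whether a mount point carries the `ro` option. The function names are hypothetical and not part of Docker.

```python
from typing import List, Optional


def mount_options(mounts_text: str, mount_point: str) -> Optional[List[str]]:
    """Return the options of the last mount listed at mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry for the same path shadows an earlier one.
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text: str, mount_point: str = "/proc/sys") -> bool:
    """True if mount_point appears in the table and has the 'ro' option."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "ro" in opts


def check_current_mounts(path: str = "/proc/mounts") -> bool:
    """Hypothetical helper: run the check against the live mount table."""
    with open(path) as f:
        return is_read_only(f.read())
```

Calling `check_current_mounts()` inside a default (non-privileged) container would be expected to return `True` if /proc/sys is protected as described. It only reports the mount state and does not explain the permission error seen after unmounting.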

Is there a way to tell a docker service not to protect /proc/sys without using --privileged (which isn't supported)?

Alternatively, are there any current plans to add --privileged support to docker service create? moby/swarmkit#1722 was closed for #32801, but that seems to be dead.

@thaJeztah
Member

Effectively, this looks to be "use a service to re-configure the host machine"? Wondering: would that be more something for a provisioning script that runs when creating the node?

thaJeztah added the kind/feature, area/networking, and area/swarm labels Jul 6, 2022
@thaJeztah
Member

./cc @evol262 @dperny

@evol262

evol262 commented Jul 6, 2022

I tend to agree -- the workaround here is not great, but it's a well-known way to sidestep this.

I appreciate that adding the node to swarm and having it happen automatically is easier, but it's also a potentially dangerous foot gun for many users if something slips into CD with unintended consequences. A container which actually modifies the host should be an intentional operation. Adding --privileged to a multi-node orchestrator would probably require adding entitlements and/or a more granular RBAC system to provide any measure of safety.

I see that you already found the same workaround @kaysond. Did it work?

@kaysond
Author

kaysond commented Jul 6, 2022

> Effectively, this looks to be "use a service to re-configure the host machine"? Wondering: would that be more something for a provisioning script that runs when creating the node?

Sort of. It's not really changing the host machine. It's configuring sysctls in docker's ingress_sbox namespace only, so it's directly relevant to the swarm in a way that I think warrants making the change from within a container. Without the correct settings in that namespace, routing performance is significantly degraded.
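To illustrate what "configuring sysctls in the ingress_sbox namespace only" amounts to when done directly on a node as root, here is a minimal Python sketch that builds the corresponding command line without running it. The namespace handle path and the sysctl key are assumptions for illustration; the thread names neither, and the function name is hypothetical.

```python
from typing import List

# Assumption: where Docker exposes the ingress sandbox's network namespace
# on a swarm node. Verify the path on your own nodes before relying on it.
INGRESS_NETNS = "/var/run/docker/netns/ingress_sbox"


def sysctl_in_netns_argv(key: str, value: str,
                         netns_path: str = INGRESS_NETNS) -> List[str]:
    """Build an argv that sets one sysctl inside the given network namespace.

    nsenter --net=<file> joins only the network namespace, so the write
    affects that namespace's net.* settings rather than the host's.
    """
    return ["nsenter", f"--net={netns_path}", "sysctl", "-w", f"{key}={value}"]


# Illustrative key only; the thread does not list the sysctls it needs.
example = sysctl_in_netns_argv("net.ipv4.ip_forward", "1")
```

The resulting list can be passed to `subprocess.run` by a process that has root privileges on the host, which is exactly what an unprivileged swarm service lacks.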

Haven't tried the workaround yet since I'm hoping for something cleaner.

Does that mean it's not possible otherwise?

@thaJeztah
Member

@evol262 would this be possible when creating a custom ingress network? (#31714)

@evol262

evol262 commented Jul 7, 2022

Arguably, yes, though the spec would need to be extended. I'm not sure that doing so would resolve this use case, though, since the service itself won't be able to create the ingress network when deployed to swarm.

@kaysond
Author

kaysond commented Jul 7, 2022

So I guess our best bet is still docker service create --privileged then? Is there anything blocking this add? Or is it just a matter of a resource to actually do it?

@evol262

evol262 commented Jul 7, 2022

An enormous amount of scoping and work around a sane entitlements/security/RBAC story to make that less dangerous.

I think we talked past each other somehow. I think the best bet is still going to be the workaround of "create a service which bind mounts the docker socket, use that to re-nsenter into the host's namespace, set sysctls there". It's ugly and a little hacky, and container breakouts to configure the host aren't really an intended/supported use case. But "let me create a service which can manipulate settings on the container host via docker service create --privileged", "adding options which allow creating networks from service definitions which set user-provided sysctls when the networks are created on hosts", and other solutions are so complex that they may not land in any reasonable timeframe.
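The workaround quoted above can be sketched in Python as two nested command lines, built here as argv lists and not executed. This is one possible reading of the description, not a supported recipe. The image names are hypothetical parameters: the outer image is assumed to contain a docker CLI, and the inner image is assumed to contain nsenter.

```python
from typing import List

DOCKER_SOCKET = "/var/run/docker.sock"


def host_breakout_argv(helper_image: str, host_command: List[str]) -> List[str]:
    """Inner step: a privileged container that joins the host's mount and
    network namespaces through PID 1, then runs host_command there."""
    return [
        "docker", "run", "--rm", "--privileged", "--pid=host",
        helper_image,
        "nsenter", "--target", "1", "--mount", "--net", "--",
        *host_command,
    ]


def service_create_argv(name: str, client_image: str,
                        inner_argv: List[str]) -> List[str]:
    """Outer step: a global service whose task talks to the node's own
    daemon through the bind-mounted socket and runs inner_argv once."""
    return [
        "docker", "service", "create",
        "--name", name,
        "--mode", "global",
        "--restart-condition", "none",
        "--mount", f"type=bind,source={DOCKER_SOCKET},target={DOCKER_SOCKET}",
        client_image,
        *inner_argv,
    ]


# Illustrative values only: image names and the sysctl key are placeholders.
inner = host_breakout_argv(
    "example/nsenter-helper", ["sysctl", "-w", "net.ipv4.ip_forward=1"]
)
outer = service_create_argv("sysctl-setter", "example/docker-cli", inner)
```

Anything that can reach the bind-mounted socket effectively has root on the node, which is the risk raised in this thread, so a service like this should be deployed deliberately and not as a side effect of a pipeline.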

It would certainly make your use case easier, but in doing so, it adds a huge amount of complexity and potential risk for others. In an IaaS/CaaS environment, being able to lock down what users can do, so that they cannot manipulate hosts in a way that lets them access information from other tenants, for example, is a necessary prerequisite to any "officially supported" mechanism.

@kaysond
Author

kaysond commented Jul 7, 2022

Got it! Thanks for the discussion.

kaysond closed this as completed Jul 7, 2022