Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #4714: correctly check thread can bind to privileged ports #4728

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

yscialom
Copy link

@yscialom yscialom commented Jul 9, 2022

Context

In order for a non-root user to run AdGuardHome and allow it to bind to privileged ports (e.g. DNS tcp/53), linux offers a set of features collectively called capabilities. It can be simply described1 but is hard to master. And I don't pretend I do.

Issue

As detailed in issues/4714, AduardHome starts by checking it has permission to bind to privileged ports. Unfortunately, the way that check is done makes it return a false negative in some conditions.

Change

This PR fixes this test, and allow non-root users to run AdGuardHome in a docker container.

Details

How it was

In the modified file, one can find the function definition of canBindPrivilegedPorts(). It delegues the check to unix.PrctlRetInt, a simple bind to Linux' prctl(). The code used to search the capability CAP_NET_BIND_SERVICE in the ambient set (PR_CAP_AMBIENT_IS_SET).

But... even though the thread has this capability, CAP_NET_BIND_SERVICE has not (yet) been raised into its ambiant set, and the test fails leading to a premature exit of the program.

How it will be

The fix is rather simple: we raise the capability to the ambient set.

If the call succeeds, the thread has indeed the privilege to bind to those ports. If it fails, it doesn't. If we happen by some spaghettisation to call multiple times canBindPrivilegedPorts(), the way capabilities and sets of capabilities have been implemented guaranties nothing happens.

Alternatives

"Belt and Braces"

An alternative would be to both raise then check:

func allowBindPrivilegedPorts() (can bool, err error) {
	cnbs, err := unix.PrctlRetInt(
		unix.PR_CAP_AMBIENT,
		unix.PR_CAP_AMBIENT_RAISE,
		unix.CAP_NET_BIND_SERVICE,
		0,
		0,
	)
	// Don't check the error because it's always nil on Linux.
	adm, _ := aghos.HaveAdminRights()

	return cnbs == 1 || adm, err
}

func canBindPrivilegedPorts() (can bool, err error) {
	cnbs, err := unix.PrctlRetInt(
		unix.PR_CAP_AMBIENT,
		unix.PR_CAP_AMBIENT_IS_SET,
		unix.CAP_NET_BIND_SERVICE,
		0,
		0,
	)
	// Don't check the error because it's always nil on Linux.
	adm, _ := aghos.HaveAdminRights()

	return cnbs == 1 || adm, err
}

IMHO, this brings nothing of value. Let me know if you disagree.

"Can I bind to 53? Let's find out!"

A simpler alternative would be to drop completly the idea to ask if we can, but simply do and see what comes.

It could be done pre-emptively:

func canBindPrivilegedPorts() (can bool, err error) {
    // bind to port 53
    // unbind
    return bound_ret, err
}

As I'm not a GO developer, I couldn't implement this alternative. If you'd prefer it, please express it by providing a PR.


1 man capabilities

For the purpose of performing permission checks, traditional UNIX
implementations distinguish two categories of processes:
privileged processes (whose effective user ID is 0, referred to
as superuser or root), and unprivileged processes (whose
effective UID is nonzero). Privileged processes bypass all
kernel permission checks, while unprivileged processes are
subject to full permission checking based on the process's
credentials (usually: effective UID, effective GID, and
supplementary group list).

Starting with kernel 2.2, Linux divides the privileges
traditionally associated with superuser into distinct units,
known as capabilities, which can be independently enabled and
disabled. Capabilities are a per-thread attribute.

@yscialom
Copy link
Author

yscialom commented Aug 1, 2022

@ainar-g & @EugeneOne1 Code review please?

@EugeneOne1
Copy link
Member

@yscialom, thanks for the contribution and sorry for a late response. The PR is seemingly ok, but we'd like to extensively test it on different devices before merging.

BTW, your code introduces not a lot of changes and is fully compliant with our guildelines. Here they are in case you interested.

@EugeneOne1
Copy link
Member

EugeneOne1 commented Aug 9, 2022

@yscialom, hello again. A couple of thigs to clarify about the implementation:

  1. Does it fix issue 4714 for your setup?
  2. Does it work with no capability manipulations at all?

The PR is really good at stating the problem but it seems our current implementation requires a different approach. We suggest:

  1. Put raising the permissions into a separate function in the same file internal/aghnet/net_linux.go. It should return the error if the capabilities couldn't be raised. For the rest OSes it should always return nil.
    Wrap it with exported function (e.g. AcquirePermissions) in some OS-independent file (internal/aghnet/net.go would be right). This is a widespread pattern of wrapping the OS-dependent code, check the aghnet.CanBindPrivilegedPorts for an example.
  2. Call the new exported function in the home.checkPermissions just above the aghnet.CanBindPrivilegedPorts call. If it fails, the error should be printed to the Info log and be followed by the aghnet.CanBindPrivilegedPorts. The success makes calling the aghnet.CanBindPrivilegedPorts redundant.

Also, I believe the check for having the net_bind_service capability should be rewritten to:

res, err := unix.PrctlRetInt(
	unix.PR_CAPBSET_READ,
	unix.CAP_NET_BIND_SERVICE,
	0,
	0,
	0,
)

What do you think about the solution?

@ainar-g ainar-g modified the milestones: v0.107.10, v0.107.11, v0.107.12 Aug 17, 2022
@yscialom
Copy link
Author

yscialom commented Aug 23, 2022

Hello @EugeneOne1, sorry for the delay, I'm coming back from holidays.

Does it fix issue 4714 for your setup?

Yes it does.
 

Does it work with no capability manipulations at all?

Shame on me, I didn't even try. Though I suppose it does, with high level of confidence. Unfortunatly, I'm not in a position where I can build AGH right now, unless you can provide me with a docker image to do so.
 

What do you think about the solution?

I agree it is a net (pun intended) improvement.
 

Regarding the two improvements you suggested on the code structural organisation, I cannot give any advice on them: I'm not a Go developper and I'm unfamiliar with your codebase.

With that being said, how can I help you going forward?

Here is my endgoal: a PR to allow the default docker image of AGH to be runnable as a non-root user by simply defining the environment variable PUID (à la linuxserver.io).

@ainar-g ainar-g removed their assignment Aug 31, 2022
@ainar-g ainar-g modified the milestones: v0.107.12, v0.107.13 Sep 7, 2022
@ainar-g ainar-g modified the milestones: v0.107.13, v0.107.14 Sep 14, 2022
@ainar-g ainar-g modified the milestones: v0.107.14, v0.107.15 Sep 29, 2022
@EugeneOne1 EugeneOne1 modified the milestones: v0.107.15, v0.107.16 Oct 3, 2022
@yscialom
Copy link
Author

@EugeneOne1 Hello, what is the status of this PR? I understand your suggestions but also I am unable to implement them. Would you rather settle for a suboptimal & working solution or for a perfect one only you could code?

@EugeneOne1
Copy link
Member

@yscialom, hello and sorry for the delay. The developers team is unfortunately busy for the moment, but we're keeping this PR opened to merge it in future with our suggestions reconsidered and implemented. We'll mention you here and in the release notes, when it'll be merged. Thank you for the contribution.

@yscialom
Copy link
Author

No worries. I understand this is not high priority, and to be honnest I'm happy to see it kept for later merge. Keep the good work there!

@abnayak
Copy link

abnayak commented Jan 15, 2023

Any plan to merge this fix soon?

@EugeneOne1
Copy link
Member

@yscialom, hello and apologies for late response. We've pushed the modified version of this PR to the separate branch. Could you please try to build AdGuard Home from the latest commit (3918789) and tell if it works in the described scenario? If it doesn't, we'd also like to check out verbose logs.

By the way, which purpose does entrypoint.sh file serve in the referenced PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants