Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAOS-15849 control: Add client uid map to agent config #14381

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

mjmac
Copy link
Contributor

@mjmac mjmac commented May 15, 2024

Allow daos_agent to optionally handle unresolvable client
uids via custom mapping. In deployments where the agent
may not have access to the same user namespace as client
applications (e.g. in containerized deployments), the
client_user_map can provide a fallback mechanism for
resolving the client uids to known usernames for the
purpose of applying ACL permissions tests.

Example agent config:

credential_config:
  client_user_map:
    default:
      user: nobody
      group: nobody
    1000:
      user: joe
      group: blow

Features: control
Required-githooks: true
Change-Id: I72905ccc5ddee27fc2101aa4358a14e352c86253
Signed-off-by: Michael MacDonald mjmac@google.com

Copy link

Ticket title is 'Provide a mechanism for mapping unknown client uids to user/group names'
Status is 'In Review'
https://daosio.atlassian.net/browse/DAOS-15849

@daosbuild1
Copy link
Collaborator

Test stage Unit Test on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-14381/1/display/redirect

@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Medium UCX Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14381/4/execution/node/1378/log

@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14381/4/testReport/

@mjmac mjmac marked this pull request as ready for review May 21, 2024 13:47
@mjmac mjmac requested review from a team as code owners May 21, 2024 13:47
Allow daos_agent to optionally handle unresolvable client
uids via custom mapping. In deployments where the agent
may not have access to the same user namespace as client
applications (e.g. in containerized deployments), the
client_user_map can provide a fallback mechanism for
resolving the client uids to known usernames for the
purpose of applying ACL permissions tests.

Example agent config:

credential_config:
  client_user_map:
    default:
      user: nobody
      group: nobody
    1000:
      user: joe
      group: blow

Features: control
Required-githooks: true
Change-Id: I72905ccc5ddee27fc2101aa4358a14e352c86253
Signed-off-by: Michael MacDonald <mjmac@google.com>
kjacque
kjacque previously approved these changes May 21, 2024
Copy link
Contributor

@kjacque kjacque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't block on any of my suggestions. This is pretty cool. Nice work.

I find myself wondering if there might be some benefit to making use of interface types for this scenario. Having two different data sources for user/group/etc., it seems like a good fit for the concept.

src/control/security/config_test.go Show resolved Hide resolved
src/control/lib/control/pool_test.go Show resolved Hide resolved
src/control/security/auth/auth_sys.go Outdated Show resolved Hide resolved
@mjmac mjmac requested a review from tanabarr May 22, 2024 13:24
Copy link
Contributor

@tanabarr tanabarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this change won't enable any undesired security consequences outside container environments? The prospect of nefarious mappings seems like something we really should think about, but perhaps it has? Is the assumption that the mappings can only be applied by an administrator who has write access to the configuration files and there aren't any other vectors that we need to be concerned about?

Also the Commit message seems to me to imply that the changes are only required for testing purposes: "...client_user_map can provide a fallback mechanism for resolving the client uids to known usernames for the purpose of applying ACL permissions tests.". I assume that this is a typo (or me misunderstanding) and the changes are indeed needed for production deployments?

src/control/security/auth/auth_sys.go Outdated Show resolved Hide resolved
src/control/security/auth/auth_sys.go Outdated Show resolved Hide resolved
@kjacque
Copy link
Contributor

kjacque commented May 22, 2024

Are we sure this change won't enable any undesired security consequences outside container environments? The prospect of nefarious mappings seems like something we really should think about, but perhaps it has? Is the assumption that the mappings can only be applied by an administrator who has write access to the configuration files and there aren't any other vectors that we need to be concerned about?

I thought about this for a bit, too. As long as the admin secures their DAOS config files and certificates from unauthorized users, I think we should be OK. If the admin has secured the system with certificates with appropriate permissions (private key readable by daos_agent only), attackers will not be able to start their own private daos_agent to interface with the system with a custom config either. I see these as the two main attack vectors. @dpquigl can you think of anything else?

@tanabarr tanabarr self-requested a review May 29, 2024 21:34
tanabarr
tanabarr previously approved these changes May 29, 2024
Copy link
Contributor

@tanabarr tanabarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerns assuaged by @dpquigl 's comments elsewhere:

So if I understand this correctly you have a multi-container environment where the socket for daos-agent is exported from one container and used in another. So when you check the sender’s credentials on the socket you’re getting he euid and egid from the process in the other container. However since the daos-agent container probably only has the users needed for the applications running in the container you can’t do a lookup on the users and groups. This seems like a reasonable solution. NFSv4 has a similar mechanism where it will translate a local uid to a name and back into a local uid on the remote side because of just this issue. The agent is a trusted entity in the security architecture so I agree that as long as you keep the agent integrity its ok for it to take on the additional responsibility of performing the translation. Is the reason this is in the same config file just for convenience? It might be a good idea to have it in a separate file for manageability.

I think moving the new section to a separate file is probably something we don't have bandwidth for currently so I'm happy to merge this now.

@mjmac
Copy link
Contributor Author

mjmac commented Jun 4, 2024

Are we sure this change won't enable any undesired security consequences outside container environments? The prospect of nefarious mappings seems like something we really should think about, but perhaps it has? Is the assumption that the mappings can only be applied by an administrator who has write access to the configuration files and there aren't any other vectors that we need to be concerned about?

As Kris & David have pointed out, the agent is considered a trusted component, and only the admin should be able to modify its configuration. As long as the agent trust is maintained, then there's no concern that malicious mappings could be created. If the agent trust is broken (due to file permissions or whatever), then pretty much anything could happen.

Also the Commit message seems to me to imply that the changes are only required for testing purposes: "...client_user_map can provide a fallback mechanism for resolving the client uids to known usernames for the purpose of applying ACL permissions tests.". I assume that this is a typo (or me misunderstanding) and the changes are indeed needed for production deployments?

Perhaps there's a better term, but I was using "test" in the sense that an access request is tested by applying the ACL and either accepting or rejecting the request. Maybe "check" would have been clearer.

Features: control
Required-githooks: true

Change-Id: Ia2d217ff7ff929e82ec78daaa3db43be0a55d964
Signed-off-by: Michael MacDonald <mjmac@google.com>
@mjmac mjmac dismissed stale reviews from tanabarr and kjacque via 77d0203 June 5, 2024 12:53
@mjmac mjmac requested review from tanabarr and kjacque June 5, 2024 12:53
@daltonbohning
Copy link
Contributor

FYI a fix for the DEB build just landed to master

mjmac added 2 commits June 5, 2024 18:14
Change-Id: I8560115e277dfed151795863285f4f9fb399d20a
Features: control
Required-githooks: true

Change-Id: I32ffae298415415a94709ffd680c1c96c46084fa
Signed-off-by: Michael MacDonald <mjmac@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
5 participants