Enhance NFS Mount Efficiency with Stage/Unstage Volume Capability #573

Open
woehrl01 opened this issue Dec 23, 2023 · 6 comments

Comments

@woehrl01
Contributor

woehrl01 commented Dec 23, 2023

Is your feature request related to a problem?/Why is this needed

Describe the solution you'd like in detail

I would like to propose an enhancement that focuses on optimizing NFS mount operations. This feature aims to improve resource utilization and reduce startup times for pods accessing NFS servers. A similar mounting behaviour already exists in the EBS CSI driver and the JuiceFS CSI driver.

The core idea is to introduce an option that leverages the stage and unstage volume capabilities of the CSI driver. The proposed changes include:

  • Single NFS Server Mount: Mount the NFS server only once for each unique combination of server name, export name, and mount options.

  • Bind Mounts for Pods: Implement actual bind mounts for each pod accessing the NFS server. This approach should also support subpaths for each pod.

  • Mount Management: Ensure that the mount operation occurs only once per unique combination mentioned above (or, more simply, once per volume ID of the PV), preventing redundant mounts; see the sketch below.
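
For illustration, a minimal Go sketch of how the stage/publish split could look in a node plugin. The helper names, paths, and shelling out to mount(8) are assumptions rather than the driver's actual code, and idempotency checks, locking, and detailed error handling are omitted:

```go
package nfsnode

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// stageVolume mounts the NFS export once per unique combination of server,
// export, and mount options onto a shared staging path. Illustrative helper:
// a real NodeStageVolume would also have to be idempotent, hold a per-volume
// lock, and check whether the staging path is already a mount point.
func stageVolume(server, export, stagingPath string, mountOptions []string) error {
	if err := os.MkdirAll(stagingPath, 0o750); err != nil {
		return err
	}
	args := []string{"-t", "nfs"}
	if len(mountOptions) > 0 {
		args = append(args, "-o", strings.Join(mountOptions, ","))
	}
	args = append(args, fmt.Sprintf("%s:%s", server, export), stagingPath)
	return exec.Command("mount", args...).Run()
}

// publishVolume bind-mounts a (sub)path of the staged mount into the pod's
// target path, so every pod on the node reuses the single NFS mount.
func publishVolume(stagingPath, subPath, targetPath string) error {
	if err := os.MkdirAll(targetPath, 0o750); err != nil {
		return err
	}
	source := filepath.Join(stagingPath, subPath)
	return exec.Command("mount", "--bind", source, targetPath).Run()
}
```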

This enhancement brings several key benefits:

  • Reduced Mount Operations: By mounting the NFS server less frequently, we can significantly reduce the number of mount operations that the NFS server has to handle.

  • Improved Cache Utilization: With fewer mounts, cache usage becomes more efficient, enhancing overall system performance.

  • Faster Startup Times for Pods: Pods accessing the NFS server will experience quicker startup times, leading to more efficient deployments and scaling operations.

Describe alternatives you've considered

An alternative could be a DaemonSet which mounts the NFS servers on the host, with those mounts then bind mounted into the pods via hostPath. The problem here is that this hides from the pod the fact that NFS is used, and it could be less reliable.

Additional context

@andyzhangx
Member

andyzhangx commented Dec 24, 2023

@woehrl01 thanks for raising this issue. I agree that adding NodeStageVolume support would reduce the number of NFS mounts, since staging is one mount per PV per node. However, it would raise another issue: NodeStageVolume does not respect fsGroupChangePolicy (SecurityContext support), while NodePublishVolume does. You can find more details here: kubernetes-sigs/azurefile-csi-driver#1224 (comment)

There is a trade-off between performance and Kubernetes compliance in whether to support NodeStageVolume, and I am not sure what the right approach is for such a requirement.

cc @jsafrane @gnufied any ideas whether we need to implement NodeStageVolume or not?
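
If staging support were added (possibly behind a feature flag, as suggested later in this thread), the node service would also need to advertise the STAGE_UNSTAGE_VOLUME capability so that the kubelet calls NodeStageVolume at all. A minimal sketch using the CSI spec's Go bindings, with an illustrative receiver type:

```go
package nfsnode

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// nodeServer stands in for the driver's node service; the name is illustrative.
type nodeServer struct{}

// NodeGetCapabilities advertises STAGE_UNSTAGE_VOLUME so the kubelet starts
// calling NodeStageVolume/NodeUnstageVolume for volumes of this driver.
func (ns *nodeServer) NodeGetCapabilities(ctx context.Context, req *csi.NodeGetCapabilitiesRequest) (*csi.NodeGetCapabilitiesResponse, error) {
	return &csi.NodeGetCapabilitiesResponse{
		Capabilities: []*csi.NodeServiceCapability{
			{
				Type: &csi.NodeServiceCapability_Rpc{
					Rpc: &csi.NodeServiceCapability_RPC{
						Type: csi.NodeServiceCapability_RPC_STAGE_UNSTAGE_VOLUME,
					},
				},
			},
		},
	}, nil
}
```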

@woehrl01
Contributor Author

@andyzhangx thank you for mentioning this problem, I wasn't aware of those discussions yet.

I'm curious whether this is actually an issue in that case. If the stage volume step only creates the initial mount for the export root of the NFS server, the publish volume step can still set the fsGroup on the actual (sub) mount point created by the bind mount.

As I'm not an expert in fsGroup, what am I missing here?

@andyzhangx
Member


@woehrl01 suppose you have an NFS mount with gid=x and then set gid=y on the bind-mount path; the original NFS mount would then also have gid=y.
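
For illustration, a small Go program (assumed to run as root on Linux; the paths and the gid are placeholders) that reproduces this propagation: chowning the bind-mount target changes the group of the underlying source directory too, because both paths resolve to the same inode:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"syscall"
)

func gidOf(path string) uint32 {
	info, err := os.Stat(path)
	if err != nil {
		log.Fatal(err)
	}
	return info.Sys().(*syscall.Stat_t).Gid
}

func main() {
	// Placeholder paths; in the driver these would be the staging and pod target paths.
	src, dst := "/mnt/nfs-staging/share", "/var/lib/kubelet/pods/example/volume"
	for _, d := range []string{src, dst} {
		if err := os.MkdirAll(d, 0o750); err != nil {
			log.Fatal(err)
		}
	}
	// Bind-mount the staged directory onto the pod target path.
	if err := syscall.Mount(src, dst, "", syscall.MS_BIND, ""); err != nil {
		log.Fatal(err)
	}
	defer syscall.Unmount(dst, 0)

	// "Apply fsGroup" on the target only...
	if err := os.Chown(dst, -1, 4242); err != nil {
		log.Fatal(err)
	}
	// ...and the source reports the same gid, because it is the same inode.
	fmt.Printf("source gid=%d target gid=%d\n", gidOf(src), gidOf(dst))
}
```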

@woehrl01
Contributor Author

@andyzhangx I see, thank you. That's an interesting behaviour I wasn't aware of.

I found https://bindfs.org/ which could be a possible solution for that bind mount behaviour.

It would still be great to have this option behind a feature flag, provided this fsGroup behaviour is documented.
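
A hedged sketch of what the bindfs variant could look like from the node plugin, shelling out to bindfs so the group seen in the published view is remapped without changing ownership on the NFS export itself; the helper name and the use of the --force-group flag are assumptions based on the bindfs documentation:

```go
package nfsnode

import (
	"os"
	"os/exec"
)

// publishWithBindfs exposes a staged NFS subdirectory at targetPath through
// bindfs, forcing the group ownership shown in the view while leaving the
// files on the NFS export untouched. Illustrative helper; idempotency and
// cleanup on NodeUnpublishVolume are omitted.
func publishWithBindfs(stagedPath, targetPath, group string) error {
	if err := os.MkdirAll(targetPath, 0o750); err != nil {
		return err
	}
	// --force-group makes all files in the mounted view appear owned by the
	// given group; the underlying NFS files keep their original ownership.
	return exec.Command("bindfs", "--force-group="+group, stagedPath, targetPath).Run()
}
```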

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 23, 2024
@woehrl01
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 23, 2024