Regression: Unable to mount filesystem volumes in containerd 1.7.0 due to https://github.com/microsoft/hcsshim/pull/1344 #1699

Open
hach-que opened this issue Mar 18, 2023 · 13 comments

Comments

@hach-que

#1344 causes containerd to upgrade HostProcess jobs to silos, which in turn breaks host process containers that use filesystem drivers like WinFsp. HostProcess containers now fail to start with:

Error: failed to create containerd task: failed to create shim task: failed to bind target "\\\\?\\Volume{473e53c8-c55d-11ed-9601-ba2be9735a86}\\" to root "C:\\UnrealEngine" for job object: Do not attach the filter to the volume at this time.: unknown

in the pod events when containerd 1.7.0 is used. Previously the HostProcess containers would start successfully.

@TBBle
Contributor

TBBle commented Jun 4, 2023

As I understand it, #1344 means that microsoft/Windows-Containers#335 now affects host-process containers too.

I saw a suggestion to make it opt-out with an annotation but I guess that didn't go anywhere.

@hach-que
Author

hach-que commented Jun 4, 2023

Correct. As of 1.7.0 it is now impossible to schedule any work that relies on filesystem drivers under containerd/Kubernetes, and HostProcess can no longer be used to "escape" the limitations of normal containers for this type of workload.

@kiashok
Contributor

kiashok commented Feb 14, 2024

The behavior being pointed out is not a regression. Starting with containerd/1.7, bind volume mounts are used (instead of symlinks, as in containerd/1.6). More details can be found here: https://github.com/kubernetes/enhancements/tree/master/keps/sig-windows/1981-windows-privileged-container-support#compatibility and https://github.com/kubernetes/enhancements/tree/master/keps/sig-windows/1981-windows-privileged-container-support#container-mounts.

PR #1344 achieves the above behavior in containerd/1.7 by elevating the job objects to a partial silo so that we can make use of the silo-local file bindings that the bind filter supports. This was a conscious decision.
Having said that, we are aware of some known issues with the approach taken for containerd/1.7+, and they are being tracked in microsoft/Windows-Containers#366.

I believe that what you could be facing is also due to a similar reason - that is, something is executing in host context and is probably being passed a path that is accessible only from the guest and not the host.
One temporary workaround for this issue is to copy the files onto the host (that is, outside of C:\hpc) and then run the container. Could you please try this workaround and let us know if it works for you?
We do understand that this is not ideal and one would lose benefits of filesystem isolation etc but this is the best workaround for right now while we are looking into a more permanent fix for this approach. We will share more details about the fix once we have it.

We do not want to support an annotation on hcsshim to roll back to the old behavior to work around this issue, as in this PR: #2022

cc @msscotb @fady-azmy-msft

@kiashok
Contributor

kiashok commented Feb 14, 2024

cc @fjs4

@hach-que
Author

hach-que commented Feb 15, 2024

The issue isn't caused by paths - 1.6 had the same issue in that you cannot use or access virtual filesystem drivers inside normal containers. Even when you mirror the filesystem layout on the host it doesn't work, because it has to do with the way that silos and filesystem drivers interact in the kernel. The same root cause prevents ProjFS from running inside a silo even when it's installed on the host.

When this regression originally occurred in 1.7, I did extensively search for a workaround and one simply does not exist. Only turning off the silo itself to behave like 1.6 allows virtual filesystems to run again.

I believe there's a related issue that prevents virtual filesystems from working at runtime (instead of at container creation, which is what the error in the original post shows). If you mount a virtual filesystem into a host process container after the container is created, it still doesn't work because - as far as I can tell - bindflt (or the filter that handles file access inside silos) doesn't support re-entering the filter pipeline for its own file access and just does file access directly. This means containers can't see the files inside a mounted path even if you do get the mount folder itself to appear.

@hach-que
Author

hach-que commented Feb 15, 2024

Sidenote: There's an argument that the fix should be to make silos work with virtual filesystems / filter drivers, but the Windows kernel team does not plan on implementing support for this, so we need an option at the hcsshim/containerd level instead.

@kiashok
Contributor

kiashok commented Mar 8, 2024

@hach-que we are continuing to look into this on our end and will share an update as soon as we have one.

@hach-que
Author

hach-que commented Apr 3, 2024

@kiashok @ntrappe-msft Is there any update here? I would really like to not be stuck on 1.6 forever.

@kiashok
Contributor

kiashok commented Apr 16, 2024

@hach-que sorry, there were some higher-priority items I had to focus on and I got sidetracked. I'll get back to you on this. Do you have a repro that you can share?
By the way, an unrelated question - this whole scenario works on 1.7 if your app is run directly on the host, correct?
Is there any motivation for containerizing the application?
So you have a virtual filesystem on the host and you are trying to access some files from there inside your container - right? I have never looked into virtual filesystems with containers. I will get back to you soon.

@hach-que
Author

@kiashok So we can use daemonsets to deploy updates across a fleet of servers without having to roll our own centralized updating mechanism.

I would also like to properly containerize eventually, but this relies on the Windows Kernel team fixing filesystem filters in silos (the same issue that prevents HostProcess jobs from being silos without breaking stuff).

@kiashok
Contributor

kiashok commented Apr 18, 2024

@hach-que I tried to repro this locally and was not able to reproduce the issue you are reporting. Let me know if I am missing anything. It would be great if you could share a simple repro if you have one.

I ran ProjFS locally on my machine using https://github.com/Microsoft/Windows-classic-samples/tree/main/Samples/ProjectedFileSystem . I then created an HPC and tried accessing the files from the folder that the ProjFS app projects into, and it works just fine!

Have you tried with ProjFS previously? Is the issue you are hitting only with WinFsp?

@kiashok
Contributor

kiashok commented Apr 18, 2024

(the same issue that prevents HostProcess jobs from being silos without breaking stuff).

Could you elaborate on what you mean by "(the same issue that prevents HostProcess jobs from being silos without breaking stuff)"? I am not aware of any issue that prevents elevating HostProcess jobs to silos. containerd/1.7+ does this for HPC.

@kiashok
Contributor

kiashok commented Apr 18, 2024

I was also able to run the same ProjFS application as an HPC container and project onto a file on the C: drive. I was able to mount this folder into another container and access files from there.
