When running with --nv, apptainer needs to map the EGL ICD file /usr/share/glvnd/egl_vendor.d/10_nvidia.json, the same way the NVIDIA Container Toolkit does, per NVIDIA/nvidia-docker#1520 (comment).
If it is not mapped, EGL initialisation fails (as seen with eglinfo).
Actual behavior
/usr/share/glvnd/egl_vendor.d/10_nvidia.json is not mapped and eglinfo fails
Steps to reproduce this behavior
Install apptainer on a host with an NVIDIA GPU and the proprietary drivers.
Create a SIF file: apptainer build --fakeroot eglinfo.sif eglinfo.recipe
Looking at this and trying to gather more information.
First, I will note that 10_nvidia.json does seem to be part of the driver. If I extract a .run file and look at the contents it's there. So it does seem like a good candidate to automatically add into the container.
However, we don't currently have a method for adding random files. We can only add libraries and binaries. To add libraries, we search /etc/ld.so.cache for the appropriate library locations. To add binaries I believe we just search the $PATH. I don't think that we should assume the location of the driver installation, so the question becomes "How do we locate (in a performant way) the 10_nvidia.json file on a particular system?"
I'm tempted to suggest that this is rarely needed (since I think this is the first request that I'm aware of) and it should therefore be something the user binds if they need to. Perhaps fix with documentation? Unsure.
Any idea how the NVIDIA container toolkit tackles this issue?
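One possible way to locate the file without hard-coding a single path is to probe the directories where glvnd vendor files are usually installed. The following is only a sketch: the function name find_nvidia_egl_icd is hypothetical, the two default directories are an assumption based on common libglvnd installs (a distro may build it with others), and environment overrides are not handled. It is not how apptainer or the NVIDIA Container Toolkit actually does it.

```shell
# Hypothetical helper: print the first NVIDIA EGL vendor ICD file found.
# Directories to search are taken from the arguments; with no arguments it
# falls back to the usual glvnd locations (an assumption, not guaranteed).
find_nvidia_egl_icd() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/glvnd/egl_vendor.d /usr/share/glvnd/egl_vendor.d
    fi
    for dir in "$@"; do
        # An unmatched glob stays literal, so the -f test filters it out.
        for f in "$dir"/*nvidia*.json; do
            if [ -f "$f" ]; then
                printf '%s\n' "$f"
                return 0
            fi
        done
    done
    return 1
}
```

If the helper prints a path, that path can be passed to -B in the same way as the manual workaround from the report, for example: apptainer run --nv -B "$(find_nvidia_egl_icd)" eglinfo.sif eglinfo. Only two small directories are checked, so the lookup stays cheap.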
Version of Apptainer
apptainer version 1.2.5
Expected behavior
When running with --nv, apptainer needs to map the EGL ICD /usr/share/glvnd/egl_vendor.d/10_nvidia.json, same as NVIDIA Container Toolkit is doing per NVIDIA/nvidia-docker#1520 (comment).
If not, EGL initialisation fails (with eglinfo).
Actual behavior
/usr/share/glvnd/egl_vendor.d/10_nvidia.json is not mapped and eglinfo fails.
Steps to reproduce this behavior
apptainer build --fakeroot eglinfo.sif eglinfo.recipe
apptainer run --nv eglinfo.sif eglinfo (without the mapping)
apptainer run --nv -B /usr/share/glvnd/egl_vendor.d/10_nvidia.json eglinfo.sif eglinfo (with manually mapping the definition json)
What OS/distro are you running
How did you install Apptainer
from source