OpenCL unavailable in Singularity with custom CUDA installation path and additional opencl related libraries from linux repository #6497

Closed
vedifredi opened this issue May 8, 2023 · 1 comment

Comments

@vedifredi

Version of Singularity:

$ singularity --version
singularity-ce version 3.11.1

Expected behavior

OpenCL GPU acceleration should be available to the proprietary simulation software Yasara when it runs from a SIF image, just as it is when Yasara runs directly on the host machine. This also worked a few years ago in a Singularity v3.5.2 image (on a host system without a custom CUDA path).

Actual behavior

The proprietary software reports "OpenCL compiler encountered an error", unfortunately without further details.

What OS/distro are you running

$ cat /etc/os-release
NAME="Linux Mint"
VERSION="20.1 (Ulyssa)"
ID=linuxmint
ID_LIKE=ubuntu
PRETTY_NAME="Linux Mint 20.1"
VERSION_ID="20.1"
HOME_URL="https://www.linuxmint.com/"
SUPPORT_URL="https://forums.linuxmint.com/"
BUG_REPORT_URL="http://linuxmint-troubleshooting-guide.readthedocs.io/en/latest/"
PRIVACY_POLICY_URL="https://www.linuxmint.com/"
VERSION_CODENAME=ulyssa
UBUNTU_CODENAME=focal

How did you install Singularity

From source.

Installed NVIDIA/CUDA versions

  • NVIDIA 530.41.03 installed using apt
  • CUDA 12.1 installed from run file at: /usr/local/cuda-12.1 (requires driver >=525.60.13)

nvidia-smi executed on host:

$ nvidia-smi
Mon May  8 11:56:31 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1050         Off| 00000000:01:00.0 Off |                  N/A |
| N/A   65C    P0               N/A /  N/A|   1350MiB /  4096MiB |     51%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1150      G   /usr/lib/xorg/Xorg                          258MiB |
|    0   N/A  N/A      2720      G   cinnamon                                     51MiB |
|    0   N/A  N/A      2776      G   /usr/lib/firefox/firefox                    162MiB |
|    0   N/A  N/A      2792      G   /usr/lib/thunderbird/thunderbird            114MiB |
|    0   N/A  N/A     12429    C+G   ./yasara                                    756MiB |
+---------------------------------------------------------------------------------------+

Inside Singularity container:

> nvidia-smi executed in Singularity
Mon May  8 12:06:18 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
...

Detailed description

I suspect the problem is caused by duplicate copies of OpenCL-related files, since three of the libraries listed in nvliblist.conf are present twice on this host system:

/usr/local/cuda-12.1/targets/x86_64-linux/lib/stubs/libcuda.so
/usr/local/cuda-12.1/targets/x86_64-linux/lib/stubs/libnvidia-ml.so
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libOpenCL.so [.1, .1.0, .1.0.0]

/usr/lib/x86_64-linux-gnu/libcuda.so [.1, .530.41.03]
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so [.1, .530.41.03]
/usr/lib/x86_64-linux-gnu/libOpenCL.so.1 [.1.0.0]
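
A comparison like the two lists above can be reproduced with a small shell helper. This is only a sketch: the function name dup_libs is made up for illustration, and it compares file names only, not library versions or contents.

```shell
#!/bin/sh
# Hypothetical helper (not part of Singularity): print shared-library file
# names that occur in more than one of the given directories.
dup_libs() {
    for dir in "$@"; do
        # Skip directories that do not exist on this machine.
        [ -d "$dir" ] || continue
        for f in "$dir"/*.so*; do
            # If the glob matched nothing, it stays literal and -e fails.
            [ -e "$f" ] && basename "$f"
        done
    done | sort | uniq -d
}

# Directories taken from the lists above; prints nothing if there is no overlap.
dup_libs /usr/local/cuda-12.1/targets/x86_64-linux/lib \
         /usr/local/cuda-12.1/targets/x86_64-linux/lib/stubs \
         /usr/lib/x86_64-linux-gnu
```

Every name printed exists in at least two of the directories, so it is a candidate for being picked up from the wrong location.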

Since a few tools installed from the Linux repositories need the repository-provided OpenCL libraries, there must be a way to tell Singularity which library versions to use for OpenCL/GPU calculations inside the container. Using the --nv flag and putting the absolute path /usr/local/cuda-12.1/targets/x86_64-linux/lib/libOpenCL.so in nvliblist.conf is evidently not sufficient.

Executed directly on my host (where the GPU is used properly), clinfo gives me lots of output but, interestingly, no ICD-related lines at the end. However, there is a note saying "NOTE: your OpenCL library only supports OpenCL 2.2":

Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 3.0 CUDA 12.1.98
  Platform Profile                                FULL_PROFILE
...
NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Invalid device type for platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform
	NOTE:	your OpenCL library only supports OpenCL 2.2,
		but some installed platforms support OpenCL 3.0.
		Programs using 3.0 features may crash
		or behave unexpectedly

Executed inside Singularity, there are ICD-related lines, but the final note mentions OpenCL version 2.1:

singularity exec --cleanenv --nv --bind /etc/OpenCL --overlay image-overlay.img image.sif /bin/bash
Singularity> clinfo
Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 3.0 CUDA 12.1.98
  Platform Profile                                FULL_PROFILE
...
NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  NVIDIA CUDA
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [NV]
  clCreateContext(NULL, ...) [default]            Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Invalid device type for platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1
	NOTE:	your OpenCL library only supports OpenCL 2.1,
		but some installed platforms support OpenCL 3.0.
		Programs using 3.0 features may crash
		or behave unexpectedly
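
The ICD loader reported above locates vendor drivers through the .icd files that are normally found under /etc/OpenCL/vendors, which is presumably why /etc/OpenCL is bound into the container. To see what the loader will find there, something like the following sketch could be run on the host and inside the container. The function name list_icds is hypothetical, and the default directory is an assumption about a standard layout.

```shell
#!/bin/sh
# Hypothetical helper: show each ICD registration file and the driver
# library it names. The default directory is assumed, not verified.
list_icds() {
    dir=${1:-/etc/OpenCL/vendors}
    for icd in "$dir"/*.icd; do
        # Skip the literal pattern when no .icd file exists.
        [ -f "$icd" ] || continue
        # The first line of an .icd file holds the name or path of the vendor library.
        printf '%s -> %s\n' "$(basename "$icd")" "$(head -n 1 "$icd")"
    done
}

list_icds
```

If the output differs between host and container, or the named library cannot be resolved inside the container, that would be a first place to look.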

Finally, if I additionally bind the entire CUDA lib path to /nvlib (which is added to LD_LIBRARY_PATH inside the container), I get the same clinfo output as on the host itself (no ICD lines, OpenCL 2.2):

singularity exec --cleanenv --nv --bind /etc/OpenCL --bind /usr/local/cuda-12.1/targets/x86_64-linux/lib:/nvlib --overlay image-overlay.img image.sif /bin/bash
Singularity> clinfo
Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 3.0 CUDA 12.1.98
  Platform Profile                                FULL_PROFILE
...
NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Invalid device type for platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform
	NOTE:	your OpenCL library only supports OpenCL 2.2,
		but some installed platforms support OpenCL 3.0.
		Programs using 3.0 features may crash
		or behave unexpectedly
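
One possible explanation for the change in clinfo output is that a different libOpenCL.so.1 is now found first, because /nvlib is on LD_LIBRARY_PATH. The sketch below prints the first directory on a search path that contains a given library. The function name first_on_ld_path is made up, and the lookup is a simplification: the real dynamic linker also consults RPATH/RUNPATH entries and the ld.so cache.

```shell
#!/bin/sh
# Hypothetical helper: print the first match for a library name on a
# colon-separated search path (defaults to LD_LIBRARY_PATH).
# The body runs in a subshell so the IFS and globbing changes stay local.
first_on_ld_path() (
    lib=$1
    set -f          # do not expand wildcards in path entries
    IFS=:
    for dir in ${2:-${LD_LIBRARY_PATH:-}}; do
        # Ignore empty path entries in this simplified lookup.
        [ -n "$dir" ] || continue
        if [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir/$lib"
            exit 0
        fi
    done
    exit 1
)

first_on_ld_path libOpenCL.so.1 || echo "libOpenCL.so.1 not found on LD_LIBRARY_PATH"
```

Running this inside the container with and without the /nvlib bind would show whether the CUDA copy or the repository copy of the loader comes first.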

However, the Yasara error remains. The Singularity installation instructions for OpenCL usage are not straightforward enough, at least for me. Can someone please shed light on which libraries need to come from which tool, and which ones might be missing or interfering in my Singularity setup?

@github-actions

github-actions bot commented May 8, 2023

New issues are no longer accepted in this repository. If singularity --version says singularity-ce, submit instead to https://github.com/sylabs/singularity, otherwise submit to https://github.com/apptainer/apptainer.

@github-actions github-actions bot closed this as completed May 8, 2023
@github-actions github-actions bot locked and limited conversation to collaborators May 8, 2023