Steps To Recreate
On bazzite-nvidia, check for cuda by running nvcc --version. It will fail to find the command.
Expected Behavior
rpm-ostree and nvidia-smi show that cuda and the cuda toolkit should be installed; however, nvcc --version fails because the command is not found.
reap@fedora:~$ nvidia-smi
Thu Feb 22 18:58:12 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------|
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX 4000 SFF Ada ...    Off | 00000000:01:00.0 Off |                  Off |
| 30%   33C    P8               5W /  70W |      2MiB / 20475MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
reap@fedora:~$ rpm -qa | grep nvidia
nvidia-gpu-firmware-20240115-2.fc39.noarch
ublue-os-nvidia-addons-0.10-1.fc39.noarch
xorg-x11-drv-nvidia-cuda-libs-545.29.06-2.fc39.x86_64
nvidia-modprobe-545.29.06-1.fc39.x86_64
nvidia-persistenced-545.29.06-1.fc39.x86_64
nvidia-container-toolkit-base-1.14.5-1.x86_64
libnvidia-container1-1.14.5-1.x86_64
libnvidia-container-tools-1.14.5-1.x86_64
nvidia-container-toolkit-1.14.5-1.x86_64
xorg-x11-drv-nvidia-kmodsrc-545.29.06-2.fc39.x86_64
libva-nvidia-driver-0.0.11-1.fc39.x86_64
xorg-x11-drv-nvidia-libs-545.29.06-2.fc39.i686
xorg-x11-drv-nvidia-libs-545.29.06-2.fc39.x86_64
nvidia-settings-545.29.06-1.fc39.x86_64
xorg-x11-drv-nvidia-power-545.29.06-2.fc39.x86_64
kmod-nvidia-6.7.5-201.fsync.fc39.x86_64-545.29.06-3.fc39.x86_64
xorg-x11-drv-nvidia-545.29.06-2.fc39.x86_64
xorg-x11-drv-nvidia-cuda-libs-545.29.06-2.fc39.i686
xorg-x11-drv-nvidia-cuda-545.29.06-2.fc39.x86_64
xorg-x11-drv-nvidia-devel-545.29.06-2.fc39.x86_64
reap@fedora:~$ nvcc --version
# only works after the workaround
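A quick way to tell "toolkit missing" apart from "toolkit installed but not on PATH" is sketched below. Note that the CUDA Version field in nvidia-smi reports the highest CUDA version the driver supports, not an installed toolkit, so it cannot answer this question on its own. The function name find_nvcc and the default search root /usr/local are assumptions made for illustration; they are not part of bazzite or the CUDA installer.

```shell
#!/usr/bin/env bash
# Report where nvcc lives, if anywhere.
# The search root defaults to /usr/local (an assumption) and can be
# overridden by passing a directory as the first argument.
find_nvcc() {
    local root="${1:-/usr/local}"

    # Case 1: nvcc is already reachable through PATH.
    if command -v nvcc >/dev/null 2>&1; then
        echo "on PATH: $(command -v nvcc)"
        return 0
    fi

    # Case 2: a toolkit directory exists, but its bin directory is not on PATH.
    local candidate
    for candidate in "$root"/cuda*/bin/nvcc; do
        if [ -x "$candidate" ]; then
            echo "installed but not on PATH: $candidate"
            return 0
        fi
    done

    # Case 3: no toolkit found under the search root.
    echo "not found"
    return 1
}

find_nvcc || true
```

On the system described in this report, the expected result before the workaround would be "not found", since the installed packages provide the driver and cuda libraries but no nvcc binary.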
Hardware
B550I AORUS PRO AX
AMD Ryzen 7 5700G
Nvidia RTX 4000 SFF Ada Gen
2x32GB @ 3200 MHz
2TB NVMe Drive
Setup Notes
Secure Boot is disabled in the BIOS.
The OS and KDE run on the AMD GPU. Steam games launch successfully on the Nvidia GPU.
After applying the workaround, PyTorch also runs successfully on the Nvidia GPU.
The Workaround
Note: the workaround does not fix the issue for podman containers running with CDI. Any workloads that require cuda will have to be run in the userspace (on the host, not in a container).
$ nvidia-smi
# this shows the correct output and says that cuda 12.3 is installed
$ nvcc --version
# this should fail to find nvcc
$ ls /usr/local
# this output does not contain cuda, which confirms that the cuda toolkit is not installed
$ wget https://developer.download.nvidia.com/compute/cuda/12.3.2/local_installers/cuda_12.3.2_545.23.08_linux.run
$ sudo sh cuda_12.3.2_545.23.08_linux.run
# this will require you to accept the licence first. Install only the cuda toolkit and deselect the driver, as the system already has the nvidia drivers.
$ ls /usr/local
# now we have the cuda toolkit, but nvcc will still fail as it is not on your path
# add the two export lines below to your ~/.bashrc so that they are loaded in every new shell
$ export PATH=/usr/local/cuda-12.3/bin${PATH:+:${PATH}}
$ export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
$ nvcc --version
# nvcc now works
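The step of adding the two export lines to ~/.bashrc can be scripted so that running it twice does not duplicate the lines. This is a minimal sketch, not part of the original workaround: the function name add_cuda_to_rc and the marker comment are hypothetical, and the default toolkit location /usr/local/cuda-12.3 is taken from the export lines above.

```shell
#!/usr/bin/env bash
# Append the CUDA path exports to a shell rc file, at most once.
# Arguments (both optional, defaults are assumptions):
#   $1  toolkit directory   (default: /usr/local/cuda-12.3)
#   $2  rc file to modify   (default: ~/.bashrc)
add_cuda_to_rc() {
    local cuda_home="${1:-/usr/local/cuda-12.3}"
    local rc_file="${2:-$HOME/.bashrc}"
    local marker="# cuda toolkit paths"

    # Skip if the marker is already present, so reruns do not duplicate lines.
    if [ -f "$rc_file" ] && grep -qF "$marker" "$rc_file"; then
        echo "already configured: $rc_file"
        return 0
    fi

    # \$ keeps the parameter expansions literal so they are evaluated
    # when the rc file is sourced, not when this function runs.
    cat >> "$rc_file" <<EOF
$marker
export PATH=$cuda_home/bin\${PATH:+:\${PATH}}
export LD_LIBRARY_PATH=$cuda_home/lib64\${LD_LIBRARY_PATH:+:\${LD_LIBRARY_PATH}}
EOF
    echo "updated: $rc_file"
}
```

Calling add_cuda_to_rc with no arguments and then opening a new terminal should make nvcc --version resolve, provided the toolkit was installed to the default location.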