
Can't use GPU: could not select device driver "nvidia" with capabilities: [[gpu]] #150

Open
ProgrammingLife opened this issue Feb 24, 2024 · 3 comments


ProgrammingLife commented Feb 24, 2024

I've successfully used my RTX 3080 Ti with Stable Diffusion, Fooocus, and Stable Cascade, so my system is ready to work with the GPU.
Arch Linux.

$ ./run.sh --model 7b --with-cuda
...
 ✔ Container llama-gpt-llama-gpt-ui-1             Recreated                                                                                                                      0.3s 
 ✔ Container llama-gpt-llama-gpt-api-cuda-ggml-1  Recreated                                                                                                                      0.3s 
Attaching to llama-gpt-api-cuda-ggml-1, llama-gpt-ui-1
llama-gpt-ui-1             | [INFO  wait] --------------------------------------------------------
llama-gpt-ui-1             | [INFO  wait]  docker-compose-wait 2.12.1
llama-gpt-ui-1             | [INFO  wait] ---------------------------
llama-gpt-ui-1             | [DEBUG wait] Starting with configuration:
llama-gpt-ui-1             | [DEBUG wait]  - Hosts to be waiting for: [llama-gpt-api-cuda-ggml:8000]
llama-gpt-ui-1             | [DEBUG wait]  - Paths to be waiting for: []
llama-gpt-ui-1             | [DEBUG wait]  - Timeout before failure: 3600 seconds 
llama-gpt-ui-1             | [DEBUG wait]  - TCP connection timeout before retry: 5 seconds 
llama-gpt-ui-1             | [DEBUG wait]  - Sleeping time before checking for hosts/paths availability: 0 seconds
llama-gpt-ui-1             | [DEBUG wait]  - Sleeping time once all hosts/paths are available: 0 seconds
llama-gpt-ui-1             | [DEBUG wait]  - Sleeping time between retries: 1 seconds
llama-gpt-ui-1             | [DEBUG wait] --------------------------------------------------------
llama-gpt-ui-1             | [INFO  wait] Checking availability of host [llama-gpt-api-cuda-ggml:8000]
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
Gracefully stopping... (press Ctrl+C again to force)

$ nvidia-smi
Sat Feb 24 17:49:50 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   36C    P8              10W / 125W |     10MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2044      G   /usr/lib/Xorg                                 4MiB |
+---------------------------------------------------------------------------------------+

What should I check?
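The `[gpu]` in the error comes from the device reservation in the compose file — Docker can only satisfy it when the NVIDIA container runtime is registered with the daemon. An illustrative fragment of such a reservation (the actual service name and values live in the repo's docker-compose-cuda-ggml.yml):

```yaml
# Illustrative compose fragment requesting the GPU via the NVIDIA driver;
# check the repo's docker-compose-cuda-ggml.yml for the real values.
services:
  llama-gpt-api-cuda-ggml:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```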

@henryruhs

nvidia-ctk runtime configure
systemctl restart docker
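Spelled out with the usual flags and a verification step — a sketch assuming the NVIDIA Container Toolkit is installed and that a standard CUDA base image (tag here is illustrative) is available:

```shell
# Register the NVIDIA runtime with Docker and verify GPU access from a
# container. Requires the nvidia-container-toolkit package; guarded so
# the script degrades gracefully when the toolkit is absent.
if command -v nvidia-ctk >/dev/null 2>&1; then
  sudo nvidia-ctk runtime configure --runtime=docker   # updates /etc/docker/daemon.json
  sudo systemctl restart docker                        # reload daemon with the new runtime
  # Verify that containers can see the GPU:
  docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
else
  msg="nvidia-ctk not found: install the nvidia-container-toolkit package first"
  echo "$msg"
fi
```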

@Haui1112

This worked for me, thanks! Now I'm stuck at "forward compatibility was attempted on non supported HW":

llama-gpt-api-cuda-ggml-1 |
llama-gpt-api-cuda-ggml-1 | /models/llama-2-7b-chat.bin model found.
llama-gpt-api-cuda-ggml-1 | make: *** No rule to make target 'build'. Stop.
llama-gpt-api-cuda-ggml-1 | Initializing server with:
llama-gpt-api-cuda-ggml-1 | Batch size: 2096
llama-gpt-api-cuda-ggml-1 | Number of CPU threads: 24
llama-gpt-api-cuda-ggml-1 | Number of GPU layers: 10
llama-gpt-api-cuda-ggml-1 | Context window: 4096
llama-gpt-api-cuda-ggml-1 | CUDA error 804 at /tmp/pip-install-7rxfzzup/llama-cpp-python_c62cf07cbfa449a7b268f9102316d6db/vendor/llama.cpp/ggml-cuda.cu:4883: forward compatibility was attempted on non supported HW

Setup:
System:
Host: TowerPC Kernel: 6.1.0-18-amd64 arch: x86_64 bits: 64
Desktop: KDE Plasma v: 5.27.5 Distro: Debian GNU/Linux 12 (bookworm)
Machine:
Type: Desktop Mobo: ASUSTeK model: PRIME X299-A II v: Rev 1.xx
serial: BIOS: American Megatrends v: 0702
date: 06/10/2020
CPU:
Info: 12-core model: Intel Core i9-10920X bits: 64 type: MT MCP cache:
L2: 12 MiB
Speed (MHz): avg: 1201 min/max: 1200/4600:4800:4700 cores: 1: 1200 2: 1201
3: 1200 4: 1200 5: 1200 6: 1200 7: 1206 8: 1200 9: 1200 10: 1232 11: 1200
12: 1200 13: 1200 14: 1200 15: 1200 16: 1200 17: 1200 18: 1200 19: 1200
20: 1200 21: 1200 22: 1200 23: 1200 24: 1200
Graphics:
Device-1: NVIDIA GA104 [GeForce RTX 3060 Ti Lite Hash Rate] driver: nvidia
v: 525.147.05
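CUDA error 804 ("forward compatibility was attempted on non supported HW") usually means the host driver is older than what the CUDA runtime inside the container requires. A minimal sketch for comparing the two version strings — the host driver is taken from the setup output above, while the required version is an assumption (CUDA 12.1's documented minimum Linux driver; check NVIDIA's compatibility table for the CUDA version your image actually ships):

```shell
# Compare the host driver version against the minimum driver required by
# the container's CUDA runtime. Values below come from this thread and
# NVIDIA's compatibility table; adjust for your setup.
host_driver="525.147.05"      # from the "Setup" output above
required_driver="530.30.02"   # assumed minimum Linux driver for CUDA 12.1

# sort -V orders version strings numerically; if the host driver sorts
# first and differs, it is older than the requirement and error 804 is
# the expected symptom. Fix: upgrade the driver, or pin an older image.
oldest=$(printf '%s\n%s\n' "$host_driver" "$required_driver" | sort -V | head -n1)
if [ "$oldest" = "$host_driver" ] && [ "$host_driver" != "$required_driver" ]; then
  echo "driver too old: upgrade the NVIDIA driver or use an older CUDA image"
else
  echo "driver version OK"
fi
```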

@kmanwar89

kmanwar89 commented Mar 27, 2024

I'm seeing this same error on a 1080 Ti, but when issuing nvidia-smi, the result is:

❯ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 550.54

I followed the instructions from NVIDIA's CUDA page here to install the CUDA drivers, but I'm guessing there's a driver mismatch somewhere.

Edit: Miraculously, a reboot didn't break my system, and I now see similar output from the nvidia-smi command. However, I don't have an nvidia-ctk command that I can run, so I remain stuck(ish).

Edit 2: A quick search indicated I needed to install the NVIDIA Container Toolkit and restart Docker. The errors have now gone away.

I was able to bring up docker-compose-cuda-ggml.yml using the command docker compose -f docker-compose-cuda-ggml.yml up -d; however, the other CUDA compose file (gguf) did not work for me.
