[Bug]: Error response from daemon: unknown or invalid runtime name: nvidia #117

Open
maplepy opened this issue Dec 15, 2023 · 5 comments
Labels
status:awaiting-triage type:bug Something isn't working

Comments

maplepy commented Dec 15, 2023

Describe the Bug

Error response from daemon: unknown or invalid runtime name: nvidia

Steps to Reproduce

  1. Run the container with the Docker runtime set to nvidia
  2. Docker returns the error above

Expected Behavior

The container starts successfully

Screenshots

No response

Relevant Settings

Same config as #116, but with DOCKER_RUNTIME set to nvidia

Version

Not displayed, but probably Build: [2023-12-09 02:38:11] [master] [6cc9f56] [debian]

Platform

Arch Linux 6.6.6-arch1-1
NVIDIA-SMI 545.29.06 Driver Version: 545.29.06 CUDA Version: 12.3
Docker version 24.0.7, build afdd53b4e3
Docker Compose version 2.23.3

Relevant log output

sudo docker-compose up --force-recreate
[+] Running 0/0
 ⠋ Container steam-headless-steam-headless-1  Recreate                       0.0s 
Error response from daemon: unknown or invalid runtime name: nvidia
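
For anyone hitting the same message: as far as I understand it, "unknown or invalid runtime name" means the Docker daemon has no runtime registered under that name. A minimal sketch for checking this, assuming a POSIX shell with awk available; has_runtime is a hypothetical helper name, not a Docker command:

```shell
#!/bin/sh
# has_runtime NAME
# Reads `docker info` text on stdin and succeeds only if NAME is listed
# on the "Runtimes:" line. (Hypothetical helper for illustration.)
has_runtime() {
    awk -v want="$1" '
        /^[[:space:]]*Runtimes:/ {
            # Field 1 is the "Runtimes:" label; the rest are runtime names.
            for (i = 2; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Usage:
#   sudo docker info | has_runtime nvidia && echo "nvidia runtime is registered"
```

If nvidia is not listed, NVIDIA's install guide registers it with `sudo nvidia-ctk runtime configure --runtime=docker`, followed by a restart of the Docker daemon.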
@maplepy maplepy added status:awaiting-triage type:bug Something isn't working labels Dec 15, 2023

maplepy commented Dec 15, 2023

sudo docker info | grep Runtime

 Runtimes: io.containerd.runc.v2 nvidia runc
 Default Runtime: nvidia

It was previously:

 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc

Note that changing this resulted in a different error:

sudo docker-compose up --force-recreate
[+] Running 1/1
 ✔ Container steam-headless-steam-headless-1  Recreated                      0.1s 
Attaching to steam-headless-1
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/moby/773e337cdf64f3b64802433e5a4657f0ec8104ee9937a47873bf04eb6d7efa8b/log.json: no such file or directory): fork/exec /usr/bin/nvidia-container-runtime: no such file or directory: unknown
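
The tail of that error (fork/exec /usr/bin/nvidia-container-runtime: no such file or directory) suggests the runtime name is now registered, but the binary it points to is not present at that path. A rough sketch for confirming that, assuming a POSIX shell; runtime_binary_status is a hypothetical helper name:

```shell
#!/bin/sh
# runtime_binary_status PATH
# Prints whether PATH exists and is executable. (Hypothetical helper.)
runtime_binary_status() {
    if [ -x "$1" ]; then
        echo "ok: $1"
    elif [ -e "$1" ]; then
        echo "not executable: $1"
    else
        echo "missing: $1"
    fi
}

# Usage, with the path taken from the error message above:
#   runtime_binary_status /usr/bin/nvidia-container-runtime
#   command -v nvidia-container-runtime   # shows where it lives, if it is on PATH
```

If this reports "missing", the path registered for the runtime in the Docker daemon configuration is probably stale, or the NVIDIA Container Toolkit is not installed.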

@p5-f20w18k

Not helpful, I know, but I had the same issue a few hours ago when trying to get this running: same OS, same config.

dexd85 commented Dec 16, 2023

Hi,

I had a similar issue from the beginning, and for me an important hint was missing: you first have to install and configure the NVIDIA Container Toolkit on your host. It does not seem to be installed on your machine. Please follow the instructions at this link:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

p5-f20w18k commented Dec 16, 2023

I do have it installed. Tested with:

docker run --gpus all nvidia/cuda:12.1.1-runtime-ubuntu22.04 nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3070        Off | 00000000:0E:00.0 Off |                  N/A |
|  0%   24C    P8              12W / 240W |     13MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

Edit: my error ends with 'runtime did not terminate successfully'.

I think this is a Docker/NVIDIA issue rather than an issue with this container.

@p5-f20w18k

See [here](https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/issues/17) for the runtime errors; the Arch Wiki was also helpful.

My compose file is below. It may not be fully correct yet, and I don't know whether it works, as I have only just run docker-compose up. I'm just documenting my workflow.

---
services:
  steam-headless:
    image: josh5/steam-headless:latest
    restart: unless-stopped
    runtime: ${DOCKER_RUNTIME}
    shm_size: ${SHM_SIZE}
    ipc: host # Could also be set to 'shareable'
    ulimits:
      nofile:
        soft: 1024
        hard: 524288
    cap_add:
      - NET_ADMIN
      - SYS_ADMIN
      - SYS_NICE
    security_opt:
      - seccomp:unconfined
      - apparmor:unconfined
    network_mode: host
    hostname: ${NAME}
    extra_hosts:
      - "${NAME}:127.0.0.1"
    env_file: .env
    # devices:
    #   - /dev/fuse
    #   - /dev/uinput
    #   - driver: nvidia
    #     count: 1
    #     capabilities: [gpu]
    devices:
      - /dev/fuse
      - /dev/uinput
      - /dev/dri/renderD128:/dev/dri/renderD128
      - /dev/dri/card0:/dev/dri/card0
      - /dev/nvidia0:/dev/nvidia0
      - /dev/nvidiactl:/dev/nvidiactl
      - /dev/nvidia-uvm:/dev/nvidia-uvm
      # - driver = nvidia
      #   count: 1
      #   capabilities: [gpu]
    device_cgroup_rules:
      - 'c 13:* rmw'
    volumes:
      - /opt/container-data/steam-headless/home/:/home/default/:rw
      - /mnt/games/:/mnt/games/:rw
      - /opt/container-data/steam-headless/.X11-unix/:/tmp/.X11-unix/:rw
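
Since the compose file takes its runtime from ${DOCKER_RUNTIME}, it may be worth confirming what value actually gets substituted. A small sketch, assuming the variable is set in a .env file next to the compose file as a plain KEY=value line without quotes; env_value is a hypothetical helper name:

```shell
#!/bin/sh
# env_value KEY FILE
# Prints the value of every plain KEY=value line found in FILE.
# (Hypothetical helper; assumes unquoted values and no "export" prefix.)
env_value() {
    sed -n "s/^$1=//p" "$2"
}

# Usage:
#   env_value DOCKER_RUNTIME .env
#   docker compose config   # prints the compose file with variables resolved
```

If the first command prints nvidia but the daemon still rejects the name, the problem is on the Docker/NVIDIA side rather than in the compose file.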
