This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

Example of nvidia-docker2 with docker-compose #568

Closed
silent-vim opened this issue Dec 12, 2017 · 10 comments

@silent-vim

1. Issue or feature description

Hi, I am getting started with nvidia-docker. I have installed the dependencies and can see the output of nvidia-smi when run through nvidia-docker. Now I am trying to figure out how to use this from a docker-compose file. I went through the documentation but could not find any example of specifying the runtime. It would be great if you could provide some pointers on this.

Most of the tutorials I found also only work with nvidia-docker v1. Example: http://collabnix.com/deploying-application-in-the-gpu-accelerated-data-center-using-docker/

2. Steps to reproduce the issue

nvidia-docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
Tue Dec 12 20:59:58 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M60           Off  | 0000030C:00:00.0 Off |                  Off |
| N/A   43C    P0    40W / 150W |      0MiB /  8123MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

3. Information to attach (optional if deemed irrelevant)

$ nvidia-docker version
NVIDIA Docker: 2.0.1
Client:
 Version:      17.09.1-ce
 API version:  1.32
 Go version:   go1.8.3
 Git commit:   19e2cf6
 Built:        Thu Dec  7 22:24:23 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.09.1-ce
 API version:  1.32 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   19e2cf6
 Built:        Thu Dec  7 22:23:00 2017
 OS/Arch:      linux/amd64
 Experimental: false
@flx42
Member

flx42 commented Dec 12, 2017

It's a work in progress for docker-compose:
docker/compose#5405

In the meantime, you can set our runtime as the default runtime, and it will work.
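With "nvidia" as the default runtime, a completely ordinary compose file is enough; a minimal sketch (the service name is arbitrary, and the image and command are just the ones from your nvidia-smi test above):

version: '2'
services:
  cuda:
    image: nvidia/cuda
    command: nvidia-smi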

@silent-vim
Author

Thanks. I looked at the container that gets created: it has nvcc (I can run the command and get the version details), but nvidia-smi results in "command not found".

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
airflow@448a3bdde1d7:~$ which nvcc
/usr/local/cuda/bin/nvcc

Also, in /etc/docker/daemon.json I can see the following.

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

Is there anything I need to do specifically to make nvidia the default runtime?

@flx42
Member

flx42 commented Dec 12, 2017

Add "default-runtime": "nvidia",

@silent-vim
Author

Awesome @flx42 that worked! Thanks to you and @3XX0 for quick help 👍

airflow@42539f02d3e2:~$ nvidia-smi
Tue Dec 12 23:10:16 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M60           Off  | 0000030C:00:00.0 Off |                  Off |
| N/A   46C    P0    41W / 150W |      0MiB /  8123MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

@ngreenwald89

Where exactly do you place the "default-runtime": "nvidia"? I put it inside the "nvidia" object and still get the same error:

{ "runtimes": { "nvidia": { "path": "nvidia-container-runtime", "default-runtime": "nvidia", "runtimeArgs": [] } } }

@flx42
Member

flx42 commented Sep 7, 2018

@ngreenwald89 there is an example here: https://github.com/NVIDIA/k8s-device-plugin#preparing-your-gpu-nodes
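In short, "default-runtime" goes at the top level of /etc/docker/daemon.json, next to "runtimes", not inside the "nvidia" object; roughly:

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}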

@carlwain74

carlwain74 commented Jun 21, 2019

@ngreenwald89 For me, the path needed the /usr/bin/ prefix; otherwise it never worked:

root@machine:PerfTest# cat /etc/docker/daemon.json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
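Also remember that dockerd has to be restarted after editing daemon.json before the new default runtime takes effect, e.g. (assuming systemd):

sudo systemctl restart docker
docker run --rm nvidia/cuda nvidia-smi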

@HWiese1980

Is this still a work in progress? More than one and a half years later?

@qhaas

qhaas commented Jul 31, 2019

From my reading of the nvidia-docker documentation, using docker run --runtime nvidia and setting default-runtime to 'nvidia' in '/etc/docker/daemon.json' appear to be functionality of the now-deprecated nvidia-docker2 package. Hopefully, we will get an alternative way of using docker-compose that doesn't require deprecated features:

Note that with the release of Docker 19.03, usage of nvidia-docker2 packages are deprecated since NVIDIA GPUs are now natively supported as devices in the Docker runtime

UPDATE:
Here is what I did to get it working, based on advice in this issue: docker/compose#6691

  1. Uninstalled the deprecated nvidia-docker2 packages (just to be sure, I removed all packages from the NVIDIA container related repositories, then removed the NVIDIA container repos themselves)
  2. Deployed nvidia-container-runtime via its repository and used the systemd override approach (roughly sketched below)
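The override approach boils down to a small systemd drop-in that registers the runtime with dockerd; roughly (a sketch only, the exact dockerd flags and paths depend on your install and distro):

/etc/systemd/system/docker.service.d/override.conf:

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --host=fd:// --add-runtime=nvidia=/usr/bin/nvidia-container-runtime

followed by a systemctl daemon-reload and a systemctl restart docker.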

The downside is that you have to have root to make these changes, but then again, if someone adds you to the 'docker' group, they likely trust you.

The upside is that we are no longer limited to the docker-compose v2.3 file format.
