Support Compose file version 2 (for `runtime: nvidia`) #241
Comments
@thomas-riccardi On our systems (researcher machines, and GPU cluster nodes) we've worked around the removal of the `runtime` option by making nvidia the default runtime in the Docker daemon configuration:

```json
// Snippet from "/etc/docker/daemon.json" on my machine
{
  // ...
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

We use nvidia-docker2 on all systems.
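With `default-runtime` set to nvidia as above, even a plain version 3 Compose file gets the nvidia runtime without any per-service option. A minimal sketch of such a service (the service name and image tag here are illustrative, not from the original comment):

```yaml
version: '3.4'
services:
  gpu-check:
    # Any image works; a CUDA base image makes the GPU access visible.
    image: nvidia/cuda:10.0-base
    # nvidia-smi only succeeds if the nvidia runtime injected the GPU devices.
    command: nvidia-smi
```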
I'm a bit hesitant to add support for the V2 compose file format; the V2 format has various features that are targeted at local development and can be non-portable, thus problematic when deploying an application. Adding support for it could reintroduce those portability problems.
Orthogonal, but specific to the use-case here: there's still discussion around "how" to expose GPUs to containers; the nvidia "runtime" is currently a thin wrapper around the default runc runtime. In future it may not be needed to have this wrapper (and this could work out of the box). That discussion should not be a blocker for implementing this, though.
/cc @3XX0 @flx42

The lack of runtime selection is going to create portability issues when you build images on the same machine with Docker Compose with a GPU, because it requires the nvidia runtime to be the mandatory default.
I also referenced this issue at docker/compose#6239
@thaJeztah Would it be possible to have a "runtimes" section, like the existing "volumes" and "networks" sections? I imagine something like this:
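No such top-level section exists in any Compose schema; the following is purely an illustrative sketch of what this comment imagines, with hypothetical key names modeled on the existing `volumes`/`networks` sections:

```yaml
version: '3.7'

# Hypothetical top-level section, analogous to "volumes:" and "networks:".
runtimes:
  nvidia:
    path: /usr/bin/nvidia-container-runtime

services:
  trainer:
    image: nvidia/cuda:10.0-base
    # Services would opt in by name, instead of relying on the
    # daemon-wide default-runtime setting.
    runtime: nvidia
```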
Modifying
Taking that approach would assume that every machine in the cluster would be able to have that runtime available. Can you elaborate why configuring the engine/daemon as part of the installation process for the nvidia runtime is not a suitable approach?
@thaJeztah E.g. what about when you want to run only some of the containers with the nvidia runtime?
What if you do not want the nvidia runtime as the default runtime? Unfortunately I do not know enough about the inner workings to know what tradeoffs are involved in setting the default runtime to nvidia when you are not running GPU jobs (just because a machine has a GPU and Docker doesn't mean it only runs GPU-job containers). But I do know more software engaged == more things that can break, so setting all containers to default to nvidia seems suboptimal to me. It would be nice to define the runtime parameters in daemon.json but keep the regular runtime as the default, and specify in the docker-compose.yml which runtime each service uses.
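In Compose file v2.x this per-service selection already works with the `runtime` key; a sketch of the mixed setup the comment asks for, assuming nvidia-docker2 is installed but *not* set as `default-runtime` (service names and images are illustrative):

```yaml
version: '2.4'
services:
  gpu-worker:
    image: nvidia/cuda:10.0-base
    # Only this service opts into the nvidia runtime.
    runtime: nvidia
  web:
    # This service runs under the default runc runtime.
    image: nginx:alpine
```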
I see your point about portability: putting a hard-coded path in the compose file could be limiting. But that's still no guarantee of portability. E.g. I have containers which can't run on nvidia drivers >400, and three machines with 1080s on 3XX drivers plus a 2080 Ti on 410. I think a possible solution might be the generic resource field. Edit: I had my syntax wrong; I need to play around with the example file. ... Okay, this might be a workable alternative.
Kinda clunky compared to a single key:value pair, but I think this could solve another pain point of mine, so it might be worth it.
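The generic resource field mentioned above does exist in the v3.5+ Compose schema (for swarm deployments, and it requires the daemon's `node-generic-resources` to be configured to advertise GPUs); a sketch of what that alternative might look like, with the service name and image as illustrative placeholders:

```yaml
version: '3.5'
services:
  trainer:
    image: nvidia/cuda:10.0-base
    deploy:
      resources:
        reservations:
          # Reserve one generic "gpu" resource instead of naming a runtime.
          generic_resources:
            - discrete_resource_spec:
                kind: 'gpu'
                value: 1
```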
If docker-app chooses to limit itself to compose file version 3.x, that is a design choice to drive development towards Swarm. CNAB itself has no such goals.
https://github.com/deislabs/cnab-spec/blob/master/100-CNAB.md
RE: "What if you do not want the nvidia runtime as the default runtime?" — that's what we're hitting, for a couple of reasons:
Description
It would be useful to support Compose file version 2. The use-case would be to use `runtime: nvidia` to access the local GPUs from docker-compose service containers. (The `runtime` option is not supported in Compose file version 3; see docker/compose#5360 (comment).)

Steps to reproduce the issue:

1. Create a `docker-compose.yaml` with `version: '2.4'`
2. `docker-app init foo`
3. `docker-app render`
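The steps above assume a minimal v2.4 Compose file; a sketch of what that file could contain (the service name and image are illustrative):

```yaml
version: '2.4'
services:
  app:
    image: nvidia/cuda:10.0-base
    # The option the whole issue is about; valid in v2.x, rejected in v3.x.
    runtime: nvidia
```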
Describe the results you received:

```
Error: failed to load Compose file: unsupported Compose file version: 2.4
```
Describe the results you expected:

`docker-app render` works with Compose file version 2.

Output of `docker-app version`: