I have been working on this problem for almost a week now, but without success.

Environment: Red Hat 7.2
k8s: v1.3.10, CUDA: 7.5, NVIDIA driver: 367.44, TensorFlow: 0.11, GPU: GTX 1080

Our platform is built on TensorFlow and k8s and is used for ML training.
Everything works fine on CPU, but it fails on GPU, and I want to know why.
I tried many of the examples you mentioned, but they all failed.
My cluster: 1 master and 2 nodes. Each node has one GPU card; only the master has none.
First I tested what @Hui-Zhi suggested:
`vim test.yaml`

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nvidia-gpu-test
spec:
  containers:
  - name: nvidia-gpu
    image: nginx
    resources:
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1
```
Yes, I tested this and it works. If I change `alpha.kubernetes.io/nvidia-gpu: 1` to `2`, it fails: the pod stays Pending, and `kubectl describe` shows that no node can satisfy the request. Since every node has only one GPU card, I think that behavior is correct.
But now the real question: how do I actually run on the GPU? This example only proves that k8s can detect and schedule the GPU resource. How can I write a YAML file that creates a pod whose workload actually runs on the GPU?
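For reference, here is a minimal sketch of what such a pod spec might look like on k8s 1.3 with the alpha GPU resource, using the TensorFlow GPU image mentioned below. The hostPath for the NVIDIA driver libraries (`/usr/lib64/nvidia`) and the script path are assumptions that depend on how the node is set up; they are not from any official example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tf-gpu-train
spec:
  containers:
  - name: tf-gpu
    image: gcr.io/tensorflow/tensorflow:0.11-gpu
    command: ["python", "/mnist.py"]   # assumption: script baked into the image
    resources:
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1
    env:
    # Make the mounted driver libraries visible to the loader,
    # so TensorFlow can find libcuda.so inside the container.
    - name: LD_LIBRARY_PATH
      value: /usr/local/nvidia/lib64:/usr/local/cuda/lib64
    volumeMounts:
    - name: nvidia-libs
      mountPath: /usr/local/nvidia/lib64
  volumes:
  # Bind-mount the host's NVIDIA driver libraries, since plain
  # Docker (unlike nvidia-docker) does not inject them.
  - name: nvidia-libs
    hostPath:
      path: /usr/lib64/nvidia   # assumption: driver install path on the node
```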
Then I found another approach: nvidia-docker.
I pulled the GPU image gcr.io/tensorflow/tensorflow:0.11-gpu and tried to run the mnist.py demo with plain Docker: `docker run -it ${image} /bin/bash`.
But it failed with an error like "could not open CUDA library libcuda.so, cannot find libcuda.so".
Has anyone else run into this problem?
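For what it's worth, plain docker can be made to behave roughly like nvidia-docker by passing the GPU device nodes and driver libraries in manually. This is only a sketch: the device node names and the driver library path (`/usr/lib64/nvidia` here) are assumptions that vary between hosts.

```shell
# Manual equivalent of what nvidia-docker does (sketch):
# expose the NVIDIA device nodes and bind-mount the host's
# driver libraries so the container can load libcuda.so.
docker run -it \
  --device=/dev/nvidiactl \
  --device=/dev/nvidia-uvm \
  --device=/dev/nvidia0 \
  -v /usr/lib64/nvidia:/usr/local/nvidia/lib64 \
  -e LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/cuda/lib64 \
  gcr.io/tensorflow/tensorflow:0.11-gpu /bin/bash
```

This is essentially the same trick as mounting the driver libraries into a k8s pod via hostPath, which is why it is relevant to the question below.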
Then I found that someone said GPUs require nvidia-docker.
Luckily, I had installed it following the TensorFlow instructions (https://www.tensorflow.org/install/install_linux#gpu_support). With nvidia-docker my training does run on the GPU, using almost 7 GB of GPU memory (about 70%).
I ran it like this:

```
nvidia-docker run -it ${image} /bin/bash
python mnist.py
```
Yes, it works. But a new question arises: am I supposed to use docker for CPU workloads and nvidia-docker for GPU workloads? I can only reach the GPU through nvidia-docker, not plain docker, so how do I run on the GPU under k8s?
Kubernetes launches its containers with docker, not nvidia-docker, so how can I get the same result there? I want to know how to run real GPU workloads on k8s, not just a demo or a test YAML that proves k8s supports GPUs.
Hopefully you can answer me; I'm waiting...
Thanks.