DIGITS IN CPU MODE #251

yawadugyamfi · 2015-08-31T22:41:30Z

First of all, is it possible to run DIGITS in CPU mode only? When I run DIGITS in CPU mode, I get this error:
"Creating layer mnist
Check failed: error == cudaSuccess (35 vs. 0) CUDA driver version is insufficient for CUDA runtime version."

Before running DIGITS, I edited the Makefile.config in Caffe to use the CPU.

lukeyeager · 2015-08-31T23:28:36Z

You need to update your driver, apparently.

What version of the CUDA toolkit do you have?

$ ls -l /usr/local/cuda

What driver version do you have?

$ nvidia-smi

yawadugyamfi · 2015-09-01T12:56:29Z

I am using cuda-7.0.
When I ran nvidia-smi, this is the result I got:
"NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver."

I have installed DIGITS on a virtual drive; that's why it can't communicate with the NVIDIA driver on my system. I am confused about why DIGITS needs my drivers if I want to run the program in CPU mode only.

lukeyeager · 2015-09-01T16:11:39Z

Why do you have CUDA installed at all if you want to run in CPU mode only? If you build Caffe with CPU_ONLY and tell DIGITS not to use any GPUs by running digits-devserver with the --config flag, that should solve your problem.

yawadugyamfi · 2015-09-02T14:08:45Z

Thanks a lot, that worked. However, I ran into another error while training my model in CPU mode only:

ERROR: Check failed: *ptr host allocation of size 190515200 failed

This is the full output:

conv2 needs backward computation.
pool1 needs backward computation.
norm1 needs backward computation.
relu1 needs backward computation.
conv1 needs backward computation.
label_data_1_split does not need backward computation.
data does not need backward computation.
This network produces output accuracy
This network produces output loss
Collecting Learning Rate and Weight Decay.
Network initialization done.
Memory required for data: 831529208
Solver scaffolding done.
Starting Optimization
Solving
Learning Rate Policy: step
Iteration 0, Testing net (#0)
Test net output #0: accuracy = 0.41
Test net output #1: loss = 1.09667 (* 1 = 1.16896 loss)
Check failed: *ptr host allocation of size 190515200 failed

lukeyeager · 2015-09-02T16:38:57Z

Now you're running out of memory. Decreasing your batch size should solve that problem.
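
A rough sketch of the arithmetic behind this, using the two byte counts from the log above. The linear scaling with batch size is an assumption (Caffe's "Memory required for data" figure is expected to grow with the batch dimension of each blob), so treat the last line as an estimate, not a measurement. The helper name to_mib is made up for illustration.

```shell
#!/bin/sh
# to_mib BYTES: convert a byte count to whole MiB (integer division, rounds down).
to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Byte counts copied from the training log above.
data_bytes=831529208      # "Memory required for data"
failed_bytes=190515200    # size of the host allocation that failed

echo "Memory required for data: $(to_mib "$data_bytes") MiB"
echo "Failed allocation:        $(to_mib "$failed_bytes") MiB"

# Assumption: blob memory scales roughly linearly with batch size,
# so halving the batch size roughly halves the figure above.
echo "Estimate at half batch:   $(to_mib $(( data_bytes / 2 ))) MiB"
```

With these numbers the network needs about 793 MiB for its data blobs, and the single allocation that failed was about 181 MiB, so a machine (or virtual machine) with little free RAM can plausibly hit this limit.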

I'm closing this issue since we've resolved the original problem.

tszjqgs · 2015-12-06T11:12:57Z

Hi both, could either of you please give the details? I ran into the very same issue.
I used DIGITS 2.0. I cd'd into the caffe folder, changed Makefile.config to enable CPU_ONLY := 1 (simply opened it in vi and saved it), and then when I run the model on the MNIST dataset, it still gives "ERROR: Check failed: error == cudaSuccess (35 vs. 0) CUDA driver version is insufficient for CUDA runtime version".

What about digits-devserver with the --config flag? Do I simply run that under the digits2 folder? I used ./runme.sh for digits2 before.

Thanks

lukeyeager · 2015-12-07T19:35:45Z

IIRC, you can still use the --config flag with the web installer. Try this: ./runme.sh --config. Choose "N" to select none.

fanser · 2016-01-04T07:40:32Z

@lukeyeager I face the same problem in DIGITS version 3. I read through your comments above, and you suggested two ways to solve it. The first is to try ./runme.sh --config (however, I didn't find a runme.sh file). The other is to run digits-devserver with the --config flag (I typed this, and didn't get any prompt about choosing CPU or GPUs). So what's wrong?
Thanks.
Best regards!

The output follows.

fzy@fzy-OptiPlex-3020:/usr/share/digits$ sudo ./digits-devserver -c
================================ Jobs Directory ================================
Where would you like to store job data?

Suggested values:
(*)  [Previous] /usr/share/digits/digits/jobs
(D)  [default]  /usr/share/digits/digits/jobs
Using "/usr/share/digits/digits/jobs"
modprobe: FATAL: Module nvidia not found.
cudaRuntimeGetVersion() failed with error #38
=================================== Log File ===================================
Where do you want the log files to be stored?

Suggested values:
(*)  [Previous] /var/log/digits/digits.log
(S)  [System]   /var/log/digits/digits.log
(D)  [default]  /usr/share/digits/digits/digits.log
(N)  [none]     <NONE>
Using "/var/log/digits/digits.log"
==================================== Caffe =====================================
Where is caffe installed?

Suggested values:
(*)  [Previous]        <PATHS>
(P)  [PATH/PYTHONPATH] <PATHS>
Using ""
==================================== Torch =====================================
Where is torch installed?

Suggested values:
(*)  [Previous]       <PATHS>
(P)  [PATH/TORCHPATH] <PATHS>
(N)  [none]           <NONE>

yawadugyamfi · 2016-01-04T14:56:09Z

I'll try to answer how you can choose between CPU and GPUs.
Within the caffe folder, there is a Makefile.config.example file.
Copy the contents of this file into a new file named "Makefile.config".
If you want to use the CPU, then:

comment out "USE_CUDNN := 1" within the "Makefile.config" file,
uncomment CPU_ONLY := 1,
issue the make all command again within the caffe folder.

I think this should resolve your issue.
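
The steps above can be sketched as a small shell helper. This is only an illustration: the function name enable_cpu_only is made up, it assumes the "NAME := 1" layout used in Makefile.config.example, and the make clean step in the usage comment is an added assumption (to avoid reusing objects from an earlier GPU build), not something stated in this thread.

```shell
#!/bin/sh
# enable_cpu_only FILE: uncomment CPU_ONLY and comment out USE_CUDNN in a
# Caffe Makefile.config. Assumes the "NAME := 1" layout of
# Makefile.config.example; all other lines are left unchanged.
enable_cpu_only() {
    cfg="$1"
    sed -e 's/^#[[:space:]]*CPU_ONLY[[:space:]]*:=[[:space:]]*1/CPU_ONLY := 1/' \
        -e 's/^[[:space:]]*USE_CUDNN[[:space:]]*:=[[:space:]]*1/# USE_CUDNN := 1/' \
        "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}

# Typical use, run from the caffe folder (not executed here):
#   cp Makefile.config.example Makefile.config
#   enable_cpu_only Makefile.config
#   make clean && make all
```

Editing the file alone changes nothing until Caffe is rebuilt, which is why the usage comment ends with a rebuild.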

fanser · 2016-01-04T15:13:38Z

@yawadugyamfi Thanks for your answer. But I installed DIGITS with the web installer, and I can't find the caffe folder. How did you work it out?

yawadugyamfi · 2016-01-04T16:02:24Z

You need caffe installed on your machine before you can run digits.

fanser · 2016-01-04T16:26:44Z

Is Caffe installed in the DIGITS root path or the home path? So is this approach called 'build from source', which differs from the 'web installer'? The 'web installer' probably installs Caffe automatically; I don't know whether my thinking is wrong. Could we talk about it? It has bothered me for days. Thanks!

yawadugyamfi · 2016-01-04T16:47:28Z

Sorry fanser, I really don't know about the web installer.
What I did was install Caffe and then download DIGITS. I didn't use the web installer.

lukeyeager · 2016-01-04T18:10:10Z

@fanser take a look at these instructions:

https://github.com/NVIDIA/DIGITS/blob/digits-3.0/docs/BuildDigits.md

It sounds like you may be trying to use a Caffe installation from the 2.0 web installer, and use the DIGITS 3.0 source from GitHub? That would be fine if you had a GPU and if you configured DIGITS to use the old Caffe installation properly. But it's probably easiest if you just rebuild Caffe to suit your needs (i.e. without CUDA).
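
One hedged way to check which kind of Caffe build DIGITS would pick up is to look at the shared libraries the caffe binary links against. The sketch below uses ldd for that; the function name links_cuda, the CAFFE_BIN variable, and the default path ./build/tools/caffe are assumptions for illustration, so substitute the location of your own build.

```shell
#!/bin/sh
# links_cuda BINARY: succeed if ldd reports CUDA libraries
# (libcudart, libcublas, libcudnn) among the binary's dependencies.
links_cuda() {
    ldd "$1" 2>/dev/null | grep -Eq 'libcudart|libcublas|libcudnn'
}

# Hypothetical path; point CAFFE_BIN at your own Caffe build.
caffe_bin="${CAFFE_BIN:-./build/tools/caffe}"

if links_cuda "$caffe_bin"; then
    echo "$caffe_bin links against CUDA libraries (GPU build)"
else
    echo "$caffe_bin shows no CUDA libraries (CPU-only build, or file not found)"
fi
```

If the binary still lists CUDA libraries after CPU_ONLY was enabled, the rebuild probably did not take effect, or DIGITS is pointing at a different, older Caffe installation.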

tszjqgs · 2016-01-05T03:10:54Z

Yes, I did, thanks! Now there is a version 3 available; I haven't tried it yet :)

Best Regards

fanser · 2016-01-05T03:22:30Z

@lukeyeager @yawadugyamfi Thank you both. And I want to apologize to @yawadugyamfi, because it should be called the 'deb installer', not the 'web installer'.
My question may have been a bit messy, so let me express it more clearly.
On the first try, I followed the instructions below, because "Deb packages are provided for easy installation on Ubuntu 14.04":

https://github.com/NVIDIA/DIGITS/blob/digits-3.0/docs/UbuntuInstall.md

I installed DIGITS step by step, and it looked like a success, because it ran at http://localhost/ without error.
Then I wanted to change from GPU to CPU-only mode. I typed these commands, copied from the link above:

% cd /usr/share/digits
# set new config
% sudo python -m digits.config.edit -v

However, I didn't get any prompt about choosing CPU or GPU mode. Because my graphics card doesn't support CUDA, I can only use CPU mode. So I abandoned the 'deb installer' and tried another way, 'build from source'. These are the instructions:

https://github.com/NVIDIA/DIGITS/blob/digits-3.0/docs/BuildDigits.md

On the second try, I downloaded the DIGITS 3.0 source and the NVIDIA/caffe master branch from GitHub. Then I copied and renamed Makefile.config.example, and uncommented CPU_ONLY := 1. Then I built Caffe and ran runtest without error, so I think the build succeeded up to this point.
I don't need Torch, so of course I didn't install it.
I started DIGITS successfully; however, I get an error when I train the MNIST model:

ERROR: Check failed: error == cudaSuccess (38 vs. 0) no CUDA-capable device is detected

I have no idea how to make it work. What went wrong?

Best Regards!

fanser · 2016-01-05T03:54:26Z

@lukeyeager @yawadugyamfi I succeeded! I just did the same things again! Thanks for your suggestions and answers!

szm-R · 2016-09-16T17:06:02Z

Hi, I have built Caffe in CPU-only mode and chose None for GPU in the config. Now DIGITS runs on the CPU, but when I try to train DetectNet it gives me this error:

I0916 21:14:16.657801 2798 layer_factory.hpp:77] Creating layer cluster
*** Aborted at 1474044257 (unix time) try "date -d @1474044257" if you are using GNU date ***
PC: @ 0x7f5c57ab289e (unknown)
*** SIGSEGV (@0x7f5cdec51c90) received by PID 2798 (TID 0x7f5ce61ad7c0) from PID 18446744073152044176; stack trace: ***
@ 0x7f5ce417fd40 (unknown)

I don't understand what it means, only that the problem seems to be with the cluster layer.
I have successfully launched a classification training, and it works.

lukeyeager · 2016-09-19T16:59:37Z

@szm2015 you're trying to train DetectNet on a CPU? I can't imagine that's much fun. Is it possible you're running out of memory?

szm-R · 2016-09-19T17:51:37Z

Actually, I'm using two laptops. One has an 850M GPU, which I use for training and testing. Because I wanted to be able to train two nets simultaneously, I wanted to use the other one (which has a GPU with only 1 GB of memory) in CPU-only mode, but so far I have not been able to make this second one work.

lukeyeager mentioned this issue Aug 31, 2015

DIGITS cudaRuntimeGetVersion() failed with error #35 #252

Closed

lukeyeager closed this as completed Sep 2, 2015

lukeyeager added the GPU label Oct 12, 2015
