
Error : Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. #24828

Closed
deepakrai9185720 opened this issue Jan 10, 2019 · 299 comments
Assignees
Labels
stat:awaiting response Status - Awaiting response from author type:build/install Build and install issues

Comments

@deepakrai9185720

Please make sure that this is a build/installation issue. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:build_template

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): Source and Binary (tried both)
  • TensorFlow version: 1.12
  • Python version: 3.6
  • Installed using virtualenv? pip? conda?: conda
  • Bazel version (if compiling from source): 0.18
  • GCC/Compiler version (if compiling from source): gcc 5.4.0
  • CUDA/cuDNN version: Cudnn - 7.4 , CUDA- 9.0
  • GPU model and memory: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.8225 8GB

Describe the problem
I tried installing TensorFlow 1.12 using both pip install and building from source. However, when I try to run a Faster R-CNN model I get the following error message:
Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

I only get this with TF 1.12 and Python 3.6; it works fine with other versions.

Provide the exact sequence of commands / steps that you executed before running into the problem

Any other info / logs
Traceback (most recent call last):
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, FeatureExtractor/MobilenetV1/Conv2d_0/weights/read/_4__cf__7)]]
[[{{node Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_21/Gather/GatherV2_2/_211}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7500_...GatherV2_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/pool.py", line 103, in worker
initializer(*initargs)
File "detection_app.py", line 67, in worker
output_q.put(y.get_stats_and_detection(frame))
File "/home/user/faster_rcnn_inception_v2_coco_2018_01_28/base_model.py", line 142, in get_stats_and_detection
boxes, scores, classes, num = self.processFrame(img)
File "/home/user/faster_rcnn_inception_v2_coco_2018_01_28/base_model.py", line 76, in processFrame
feed_dict={self.image_tensor: image_np_expanded})
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Conv2D (defined at /home/user/faster_rcnn_inception_v2_coco_2018_01_28/base_model.py:36) = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, FeatureExtractor/MobilenetV1/Conv2d_0/weights/read/_4__cf__7)]]
[[{{node Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_21/Gather/GatherV2_2/_211}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7500_...GatherV2_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op 'FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Conv2D', defined at:
File "detection_app.py", line 94, in
pool = Pool(args.num_workers, worker, (input_q, output_q))
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/context.py", line 119, in Pool
context=self.get_context())
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/pool.py", line 174, in init
self._repopulate_pool()
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/pool.py", line 239, in _repopulate_pool
w.start()
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/process.py", line 105, in start
self._popen = self._Popen(self)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/popen_fork.py", line 19, in init
self._launch(process_obj)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/popen_fork.py", line 73, in _launch
code = process_obj._bootstrap()
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/multiprocessing/pool.py", line 103, in worker
initializer(*initargs)
File "detection_app.py", line 62, in worker
y = DetectorAPI()
File "/home/user/faster_rcnn_inception_v2_coco_2018_01_28/base_model.py", line 36, in init
tf.import_graph_def(od_graph_def, name='')
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
_ProcessNewOps(graph)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3440, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3440, in
for c_op in c_api_util.new_tf_operations(self)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3299, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/home/user/anaconda3/envs/tf_faust/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in init
self._traceback = tf_stack.extract_stack()

UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Conv2D (defined at /home/user/faster_rcnn_inception_v2_coco_2018_01_28/base_model.py:36) = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, FeatureExtractor/MobilenetV1/Conv2d_0/weights/read/_4__cf__7)]]
[[{{node Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_21/Gather/GatherV2_2/_211}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7500_...GatherV2_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

@ymodak ymodak self-assigned this Jan 11, 2019
@learnermaxRL

In the meantime I have tried cuDNN versions 7.1, 7.0.5, 7.3, and 7.4 with gcc 6, still no luck. However, I don't get any of these issues when I install it from conda using conda install tensorflow-gpu.
However, I want to build from source, so I would prefer that this issue be resolved.

@ssk1991

ssk1991 commented Jan 21, 2019

I had the same issue with TensorFlow 1.12 on an almost identical system as yours. The solution is to downgrade TensorFlow to 1.8.0 using:
pip install --upgrade tensorflow-gpu==1.8.0

https://devtalk.nvidia.com/default/topic/1043867/failed-to-get-convolution-algorithm-this-is-probably-because-cudnn-failed-to-initialize

@Bahramudin

Bahramudin commented Jan 24, 2019

I also have the same error with TF 1.12 and 1.11, using CUDA 9.0 and cuDNN 7.3.1 / 7.4.2. Sometimes it works, but sometimes it does not. What is causing this error? Did anyone solve it?

@ymodak
Contributor

ymodak commented Jan 24, 2019

@gunan Can you please take a look or suggest someone? Apparently there is an incompatibility between CUDA 9.0 and cuDNN versions above 7.0. Thanks!

@gunan
Contributor

gunan commented Jan 25, 2019

I cannot help much on this one. Maybe TF GPU team can help?

@Bahramudin

This error may be related to installing TF with conda.

A possible solution is this:
In the command line, run:
conda list cudnn
It will print:
Name Version Build Channel

If the result is not empty, as above, it means you installed TF with conda. When conda installs TF, it also installs all of the dependencies, including CUDA and cuDNN, but the bundled cuDNN version is too old for TF, which causes a compatibility problem. So uninstall the cuDNN and CUDA that conda installed, then run TF again and it will work.
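
As a quick check after removing the conda-installed CUDA/cuDNN, here is a minimal sketch (assuming a pip-installed tensorflow-gpu wheel and the TF 1.x API used elsewhere in this thread) to confirm that TensorFlow picks up the system GPU:

import tensorflow as tf
from tensorflow.python.client import device_lib

# True only if this TensorFlow build was compiled with CUDA support
print(tf.test.is_built_with_cuda())
# The device list should include a /device:GPU:0 entry once CUDA and cuDNN load correctly
print(device_lib.list_local_devices())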

@ymodak
Contributor

ymodak commented Jan 25, 2019

@deepakrai9185720 Is this still an issue for you? Can you please try @Bahramudin 's suggestion and confirm if it solves the problem for you?

@ymodak ymodak added the stat:awaiting response Status - Awaiting response from author label Jan 25, 2019
@guotong1988
Contributor

maybe same problem..

@Adarshreddyash

I think it is a version problem. Let us know if @ssk1991's solution works.

@ymodak
Contributor

ymodak commented Feb 1, 2019

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!

@ymodak ymodak closed this as completed Feb 1, 2019
@SubigyaPanta

I'm also facing this issue. Is there any workaround other than downgrading?

@Seondong

Seondong commented Feb 8, 2019

Same issue here!

@cristianccq

cristianccq commented Feb 11, 2019

Hi, I had the same error.
I downloaded the cuDNN package for my system and replaced the old one with it.
This solved my problem.

@so-sal

so-sal commented Feb 11, 2019

I had the same issue with TensorFlow 1.12 on an almost identical system as yours. The solution is to downgrade TensorFlow to 1.8.0 using:
pip uninstall tensorflow-gpu
pip install --upgrade tensorflow-gpu==1.8.0

https://devtalk.nvidia.com/default/topic/1043867/failed-to-get-convolution-algorithm-this-is-probably-because-cudnn-failed-to-initialize

@Bahramudin

It is just a cuDNN version incompatibility. I also downgraded to 1.8, and although that solves the problem, there is no need to when a much newer version is available. I found that I had installed TF with conda, and conda also installed everything, including CUDA and cuDNN, so Python was not detecting my own installed CUDA and cuDNN; it was using the ones installed by conda, which included a very old cuDNN. I deleted the conda-installed CUDA and cuDNN, installed TF with pip, and it was OK.

@447zyg

447zyg commented Feb 13, 2019

I also had the same problem.
Try deleting the old cuDNN SDK (as I remember, it was not the build for CUDA 9.0), then download cudnn-9.0-windows10-x64-v7.4.1.5.
The newer cudnn-9.0-windows10-x64-v7.4.2.24.zip also works well.
Make sure it is the build for CUDA 9.0; that is very important.
Then it works well.

system win 10
tensorflow 1.12
CUDA 9.0
cuDNN SDK 7.4.1.5
GPU GTX1060

@nardeas

nardeas commented Feb 17, 2019

Had the same problem with CUDA 9.0 and cuDNN 7.0.5.15-1 on Ubuntu 16.04 with TensorFlow 1.12. Updating to cuDNN 7.4.2.24 fixed it for me!

@gogasca

gogasca commented Feb 18, 2019

Did you use NCCL? If so, which version?

@oscarlinux

Same issue here. I have an RTX 2070, CUDA 10, cuDNN 7.4.1 and TensorFlow 2.0 running on Ubuntu 18.04. I downgraded cuDNN to 7.3.0 but still get the same error. I see that downgrading TensorFlow helped some people, but I guess that's not an option for me. Any help is much appreciated.

@oscarlinux

oscarlinux commented Feb 18, 2019

OK, I was able to execute my CNN. I'm using tensorflow tf-nightly-gpu-2.0-preview, and running in an IPython notebook. I had to add this to my notebook:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

Here are some more details

Also, this issue is associated with #24496

@aishwaryap

aishwaryap commented Feb 19, 2019

I'm having the same issue with Cuda 9.0, Cudnn 7.4.2 and 7.0.5.
Also, I installed tf using pip, not conda. I downloaded cudnn on my own from the Nvidia website and linked to it.
In my case, downgrading to tf 1.8 did not help. Is there any other fix for this?

@oscarlinux

Did you try setting up allow_growth = True? That resolved the problem for me.

@clhne

clhne commented Feb 19, 2019

Did you try setting up allow_growth = True? That resolved the problem for me.

Yes, it helps!
Thanks.
@aishwaryap
You can try setting up allow_growth:

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

@AndrewUlmer

OK, I was able to execute my CNN. I'm using tensorflow tf-nightly-gpu-2.0-preview, and running on a ipython notebook. I had to add this to my notebook:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

Here are some more details

Also, this issue is associated with [24496] (#24496)

This solution worked for me. Just to add on: if you set allow_growth = True during training, you have to configure the GPU the same way when restoring the model, otherwise you will have issues.
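
A minimal sketch of that point, assuming the TF 1.x API; the checkpoint paths and the use of import_meta_graph here are illustrative, not from the comment above:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # same GPU options as were used during training

with tf.Session(config=config) as sess:
    # hypothetical checkpoint files: rebuild the graph, then load the trained weights
    saver = tf.train.import_meta_graph("/tmp/model.ckpt.meta")
    saver.restore(sess, "/tmp/model.ckpt")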

@ghost

ghost commented Jan 8, 2021

OK, I was able to execute my CNN. I'm using tensorflow tf-nightly-gpu-2.0-preview, and running on a ipython notebook. I had to add this to my notebook:
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)
Here are some more details
Also, this issue is associated with [24496] (#24496)

This solution worked for me. Just to add on - if you set allow_growth = True during training, you have to configure the gpu in the same way when restoring the model otherwise you will have issues.

Yeah, it works! Actually, what is the concept behind enabling allow_growth = True?

@HGamalElDin

HGamalElDin commented Jan 18, 2021

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

I tried this solution but it doesn't work. My system specifications are:
TF version: 2.2
OS: Windows Server 2019
cuda version: 10.1
cuDNN: 7.6.4
GPU: GTeslaV100
can you help please?

@ghost

ghost commented Jan 19, 2021

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

I tried this solution but doesn't work, my system specifications are:
TF version: 2.2
OS: Windows Server 2019
cuda version: 10.1
cuDNN: 7.6.4
GPU: GTeslaV100
can you help please?

physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

Try with this.

@Krishnarohith10

Krishnarohith10 commented Jan 28, 2021

Here is how I solved it.
First install the appropriate drivers for your graphics card.
Install Anaconda.

conda update conda
conda update anaconda

then 

conda create -n py36 python=3.6

# first thing you do
conda install tensorflow-gpu=1.15
# this installs cudatoolkit=10.0, cudnn=7.6.5 and of course tensorflow-gpu=1.15

Now, if you want to run TensorFlow with eager execution, then:

import tensorflow as tf
tf.enable_eager_execution()

This lets you run tensorflow>=2.0.0-style code, and even write it that way.

Otherwise, if you want to stick with the older tensorflow<2.0.0 style, run as usual with tf.Session(), tf.placeholder, tf.Variable, and so on.
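
As a hedged smoke test for a setup like the one above (TF 1.15 with eager execution enabled), a small convolution is enough to exercise the cuDNN path; if cuDNN is picked up correctly, it runs without the "Failed to get convolution algorithm" error:

import tensorflow as tf
tf.enable_eager_execution()

x = tf.random.normal([1, 64, 64, 3])   # NHWC input batch
w = tf.random.normal([3, 3, 3, 8])     # 3x3 kernel, 3 input / 8 output channels
y = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME")
print(y.shape)                         # expected: (1, 64, 64, 8)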

@KathyReid

Confirming that I hit this error while training a model on an RTX 2060, and setting TF_FORCE_GPU_ALLOW_GROWTH to true resolved the error.
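
For reference, one way to set that (assuming the standard TF_FORCE_GPU_ALLOW_GROWTH environment variable) is to export it before TensorFlow is imported, e.g.:

import os
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"   # must be set before TF initializes the GPU

import tensorflow as tf   # imported after the variable is set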

@tomoleary

tomoleary commented Feb 12, 2021

Had a similar issue on a machine with 2 A100s. There was some device ambiguity and so looping through the devices and setting the memory growth manually worked in a tensorflow 2.x environment.

gpu_devices = tf.config.experimental.list_physical_devices('GPU')
for device in gpu_devices: tf.config.experimental.set_memory_growth(device, True)

Taken from a different issue

#25446 (comment)

@Player1-DON

What worked for me on Win10 using Anaconda (Python 3.5.x and an NVIDIA GTX 1650) was:
- Downgrade to CUDA 9.0 (with matching cuDNN 7.x)
- Downgrade to TensorFlow 1.8.0 (check with 'pip show tensorflow')
- Downgrade to tensorflow-gpu 1.8.0 (check with 'pip show tensorflow-gpu')
- Make sure to overwrite the cuDNN files. For some reason I had to overwrite the existing cudnn.lib file in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64 with the one from the cuDNN .zip download; the version downgrade left the more recent file in place, which had to be replaced for the "failed convolution" error to go away.

--Hope this helps--

@coorful

coorful commented Mar 10, 2021

It is probably because of GPU memory growth under the TensorFlow framework, so try adding

import tensorflow as tf
from keras import backend as K
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4  # max GPU memory fraction
K.set_session(tf.Session(config=config))
K.get_session().run(tf.global_variables_initializer())

before the rest of the code. It works for me.

@init-22

init-22 commented Mar 17, 2021

Try:

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        print(e)

@SohaibAnwaar

Hi, I hope that you are all doing well. I need to train my mrcnn model on a GTX 3070. The model loads onto the GPU but gets stuck when training starts; no error appears, it just hangs. When I list TensorFlow devices it shows that the GPU exists, but training does not start.

Screenshot from 2021-04-18 12-53-28

Versions I am using:

  1. Tensorflow 2.4
  2. cudnn 8
  3. cuda 11.0
  4. nvidia-drivers 460

Screenshot from 2021-04-18 12-45-13

Screenshot from 2021-04-18 12-45-37

Screenshot from 2021-04-18 12-47-57

I will really be thankful to you for helping me out. Thank you

@AlexMLindemann

If the result is not empty as the above, so it means you used conda installed TF, when using conda for installing TF, then it will install all the dependencies even CUDA and cuDNN, but the cuDNN version is very low for TF, so it will bring compatibility problem. So you can uninstall the cuDNN and the CUDA which was installed by conda, and then run TF, then it will work.

So one should reinstall using the NVIDIA installer?

@sauravsolanki

This solved my problem here. Try to match the versions.

@q-55555

q-55555 commented Jun 11, 2021

@sauravsolanki

This solve my problem here. Try to match the verison.

So you managed to make it work with the following versions?
tensorflow-2.4.0 | python 3.6-3.8 | GCC 7.3.1 | Bazel 3.1.0 | cuDNN 8.0 | CUDA 11.0

@sauravsolanki

@q-55555 Yes.

@yongfanbeta

This solve my problem here. Try to match the verison.

Especially the cuDNN version and the TensorFlow version: I downgraded my TF version to match the cuDNN version, and then it was OK.
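
For checking which CUDA/cuDNN versions the installed wheel expects, here is a small sketch; it relies on tf.sysconfig.get_build_info(), which recent TF 2.x releases provide (older releases do not have it):

import tensorflow as tf

info = tf.sysconfig.get_build_info()   # available in recent TF 2.x releases
print(info.get("cuda_version"))        # CUDA version the wheel was built against
print(info.get("cudnn_version"))       # cuDNN version the wheel was built against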

@mobius1983

mobius1983 commented Jul 6, 2021

I had the same problem and I solved it with this code:
import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.compat.v1.InteractiveSession(config=config)

@Furhyat

Furhyat commented Jul 12, 2021

i also has same problem
try it delete the old cuDNN SDK (i remember it's no for 9.0)
download the cudnn-9.0-windows10-x64-v7.4.1.5
new cudnn-9.0-windows10-x64-v7.4.2.24.zip also work well
for 9.0 9.0 9.0
it's very important
then it work well

system win 10
tensorflow 1.12
CUDA 9.0
cuDNN SDK 7.4.1.5
GPU GTX1060

It works!

@leejiajun

I got the same issue.
pip install --upgrade tensorflow-gpu==1.10.0 solves it.

@farmakis

Had a similar issue on a machine with 2 A100s. There was some device ambiguity and so looping through the devices and setting the memory growth manually worked in a tensorflow 2.x environment.

gpu_devices = tf.config.experimental.list_physical_devices('GPU') for device in gpu_devices: tf.config.experimental.set_memory_growth(device, True)

Taken from a different issue

#25446 (comment)

Thanks, this also worked for me on an RTX 2070 Super, TF 2.2, CUDA 10.1 on Ubuntu 18.04.

@WenxinFan

In the meanwhile I have tried with Cudnn versions : 7.1,7.0.5,7.3,7.4 , gcc6,still no luck, however I dont get any of these issues when i installed it from conda using conda install tensorflow-gpu. However I want to build from source hence I would prefer if this issue is resolved

Thanx!!!

@Liqq1

Liqq1 commented Apr 23, 2022

Add the following code

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)

@liguowei-CAS

I had the same error with cudnn=7.6.5 and tensorflow-gpu=2.3.0.
Then I downgraded to cudnn=7.6.0, and no error was reported!

@BwandoWando

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

I tried this solution but doesn't work, my system specifications are:
TF version: 2.2
OS: Windows Server 2019
cuda version: 10.1
cuDNN: 7.6.4
GPU: GTeslaV100
can you help please?

physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

Try with this.

Many years later, this worked for me.

I'm using an RTX 4090 on Ubuntu 20.04
cuda_version: 11.2
cudnn_version:
