
TypeError: Fail to find the dnn implementation. #10634

Closed
rosefun opened this issue Jul 10, 2018 · 24 comments

@rosefun

rosefun commented Jul 10, 2018

Platform: Windows10
Tensorflow Version: 1.7.0(GPU)
Cuda compilation tools, release 9.0, V9.0.176
CUDNN: 7.1.2
Graphic processor: Nvidia Geforce GTX 1050

My code:

from keras.layers import CuDNNLSTM, Bidirectional

lstmsize = 6
lstm0 = CuDNNLSTM(lstmsize, return_sequences=True)

Error:

UnknownError (see above for traceback): Fail to find the dnn implementation.
[[Node: cu_dnngru_1/CudnnRNN = CudnnRNN[T=DT_FLOAT, direction="unidirectional", dropout=0, input_mode="linear_input", is_training=true, rnn_mode="gru", seed=87654321, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](cu_dnngru_1/transpose, cu_dnngru_1/ExpandDims_1, cu_dnngru_1/Const_1, cu_dnngru_1/concat)]]
[[Node: loss/mul/_73 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_618_loss/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Hoping for help!

@ChristofHenkel

ChristofHenkel commented Jul 14, 2018

I have the same problem under Linux Ubuntu 16.04

Maybe this helps:

https://devtalk.nvidia.com/default/topic/1030610/cuda-setup-and-installation/fail-to-find-the-dnn-implementation-/

@duhaime

duhaime commented Oct 20, 2018

Same problem on Ubuntu 18.04.1 LTS running CUDA V9.0.176 and cuDNN 7.2.1. Ditto on RHEL 7.4 with CUDA V9.0.176 and cuDNN v7 for CUDA 9.0.

@ASH1998

ASH1998 commented Nov 9, 2018

For CUDA 9.0 and cuDNN 7.1.1:
CuDNNLSTM and CuDNNGRU ran successfully, then after some days gave the same error.
Fixed by reinstalling CUDA and cuDNN.

There has to be some better solution. This way is too tiresome and lengthy!

@kyleabeauchamp

kyleabeauchamp commented Feb 26, 2019

I'm also seeing this error on Ubuntu 18.04, RTX 2070, CUDA 10, Keras, and tf-nightly-gpu. I cross-posted on NVIDIA but haven't seen much help there: https://devtalk.nvidia.com/default/topic/1046589/cuda-setup-and-installation/issues-with-tensorflow-on-cuda10-and-rtx2080/

@infinitylogesh

I had the same issue when I updated TensorFlow to 1.12. The error was resolved after updating my cuDNN version from 7 to 7.5. I followed the steps in the URL below to update cuDNN (note: the steps in the link are for installing cuDNN, but the same procedure applies to an update).

https://jhui.github.io/2017/09/07/AWS-P2-CUDA-CuDNN-TensorFlow/

@kyleabeauchamp

I ended up fixing this issue with the allow_growth = True comment on tensorflow/tensorflow#24496
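For anyone landing here, a minimal sketch of that fix, assuming the TF 1.x session API with the Keras backend (adjust to however you create your session):

import tensorflow as tf
from keras import backend as K

# Ask TensorFlow to allocate GPU memory on demand instead of reserving it all
# up front; this is the allow_growth setting referenced above.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))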

@shiningliang

shiningliang commented Mar 11, 2019

Platform: Ubuntu 18.04
Tensorflow Version: 1.13.1(GPU)
CUDA: V10.0.130
CUDNN: 7.4.2
GPU: RTX 2080Ti

I got the same error. I had built the graph; the error occurred when initializing variables. When I used tf-nightly-gpu 1.13 I didn't get this error.
I have also set allow_growth = True, but it didn't work.

@oinksterthepig

I got this error while running cuDNN LSTMs. They worked for a while, then they quit working. I did "conda update tensorflow-gpu" and that fixed it. The problem must be somewhere in TensorFlow?

@00krishna

I got this error last night while working on the TensorFlow tutorial "https://www.tensorflow.org/alpha/tutorials/load_data/text". I was using tensorflow-gpu 2.0alpha on an Ubuntu 18.04 x64 machine with Python 3.6. I updated my cuDNN from 7.4 to 7.5.1 and tried to upgrade TensorFlow too, but that did not change anything. I was able to compile the cuDNN samples MNIST network, which is the usual test for a successful install. Just wanted to let you know about the continuing issue.


@cageyoko

I got this error while running cuDNN LSTMs. They worked for a while, then they quit working. I did "conda update tensorflow-gpu" and that fixed it. The problem must be somewhere in TensorFlow?

I also used 'conda update tensorflow-gpu' and it fixed it. Thanks!

@gerlaic

gerlaic commented May 20, 2019

I got this error last night while working on the TensorFlow tutorial "https://www.tensorflow.org/alpha/tutorials/load_data/text". I was using tensorflow-gpu 2.0alpha on an Ubuntu 18.04 x64 machine with Python 3.6. I updated my cuDNN from 7.4 to 7.5.1 and tried to upgrade TensorFlow too, but that did not change anything. I was able to compile the cuDNN samples MNIST network, which is the usual test for a successful install. Just wanted to let you know about the continuing issue.

Reference: tensorflow/tensorflow#20067 (comment)

Have you made sure your GPU is available? If you have any other session running on the same GPU on Windows, you will want to halt and close it.

Try the following snippet to check whether you have a GPU available. This error will occur when there is no available device:

from tensorflow.python.client import device_lib

def get_available_gpus():
    # List the devices TensorFlow can see and keep only the GPUs
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']
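For example (hypothetical call; an empty list means TensorFlow cannot see a GPU):

print(get_available_gpus())  # e.g. ['/device:GPU:0']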

@VertexC

VertexC commented Jun 10, 2019

I fixed this issue by upgrading cuDNN from 7.0 to 7.5. I am using CUDA 10.1 and tf-gpu 1.14 on Ubuntu 16.04.

@morningsky

I ended up fixing this issue with the allow_growth = True comment on tensorflow/tensorflow#24496

Thanks! I solved this problem your way.

@FrozenWolf-Cyber

In TensorFlow 2.0 I got the same error while running an RNN LSTM model. The reason was the low version of my cuDNN. The TensorFlow GPU requirements page recommends cuDNN SDK >= 7.4.1; you can refer to https://www.tensorflow.org/install/gpu for more details.
Asked in the TensorFlow Reddit forum: https://www.reddit.com/r/tensorflow/comments/dxnnq2/i_am_getting_an_error_while_running_the_rnn_lstm/?utm_source=share&utm_medium=web2x
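If it helps, here is a quick sketch of checking which CUDA/cuDNN versions your TensorFlow binary expects (assuming TF 2.3+, where tf.sysconfig.get_build_info() is available):

import tensorflow as tf

# Print the CUDA and cuDNN versions this TensorFlow build was compiled against,
# so you can compare them with what is installed on the machine.
info = tf.sysconfig.get_build_info()
print(info.get('cuda_version'), info.get('cudnn_version'))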

@tricyzhou

Maybe you can solve it with "tf.config.experimental.set_memory_growth()"!

@Shekhrozx

Shekhrozx commented Apr 3, 2020

Try this. It works

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
tf.config.experimental.set_virtual_device_configuration(
    gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])

@anisayari

I got the same error after trying to train a model again... and I solved it with the same solution as @Shekhrozx.

@sergio12S

I solved this problem this way:

physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)

@arnabanerji

The recommended format directly from the TF docs in 2.0+ is:

physical_devices = tf.config.list_physical_devices('GPU')
try:
  tf.config.experimental.set_memory_growth(physical_devices[0], True)
except:
  # Invalid device or cannot modify virtual devices once initialized.
  pass

@ChenMalobani

Try this. It works

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
tf.config.experimental.set_virtual_device_configuration(gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])

What should I do? I get:

RuntimeError: Physical devices cannot be modified after being initialized

@Shekhrozx

Try this. It works

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
tf.config.experimental.set_virtual_device_configuration(gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])

what to do?

RuntimeError: Physical devices cannot be modified after being initialized

It seems that you are initializing your GPU two or more times. Please check your code and initialize your GPU only once.
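As a rough sketch of the ordering that avoids this RuntimeError (assuming TF 2.x): the memory-growth call has to run before TensorFlow initializes the GPU, i.e. before any tensor, layer, or model is created:

import tensorflow as tf

# Configure memory growth immediately after import, before anything touches
# the GPU; calling this later raises
# "Physical devices cannot be modified after being initialized".
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

# ...only now build and train the model...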

@Harsh188

I got a similar issue with TF 2.4.1. The problem was fixed after I upgraded to TF 2.5.0 with cuDNN 8.1.0 and CUDA 11.2.
