ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory #15604

Closed
kirk86 opened this issue Dec 23, 2017 · 142 comments

Comments

@kirk86

kirk86 commented Dec 23, 2017

I installed tf-nightly build and I get the following error on import of tensorflow.
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory.

If I check for cuda 9, I get the following:

ldconfig -v
/usr/local/cuda-8.0/targets/x86_64-linux/lib:
	libnvgraph.so.8.0 -> libnvgraph.so.8.0.61
	libnppicom.so.8.0 -> libnppicom.so.8.0.61
	libnppial.so.8.0 -> libnppial.so.8.0.61
	libcufftw.so.8.0 -> libcufftw.so.8.0.61
	libcufft.so.8.0 -> libcufft.so.8.0.61
	libnppif.so.8.0 -> libnppif.so.8.0.61
	libcublas.so.8.0 -> libcublas.so.8.0.88
	libnvblas.so.8.0 -> libnvblas.so.8.0.88
	libnppi.so.8.0 -> libnppi.so.8.0.61
	libcusolver.so.8.0 -> libcusolver.so.8.0.61
	libnppidei.so.8.0 -> libnppidei.so.8.0.61
	libnvrtc-builtins.so.8.0 -> libnvrtc-builtins.so.8.0.61
	libnvrtc.so.8.0 -> libnvrtc.so.8.0.61
	libnpps.so.8.0 -> libnpps.so.8.0.61
	libcuinj64.so.8.0 -> libcuinj64.so.8.0.61
	libnppig.so.8.0 -> libnppig.so.8.0.61
	libOpenCL.so.1 -> libOpenCL.so.1.0.0
	libnppicc.so.8.0 -> libnppicc.so.8.0.61
	libnppist.so.8.0 -> libnppist.so.8.0.61
	libnppisu.so.8.0 -> libnppisu.so.8.0.61
	libnppim.so.8.0 -> libnppim.so.8.0.61
	libcurand.so.8.0 -> libcurand.so.8.0.61
	libcudart.so.8.0 -> libcudart.so.8.0.61
	libnvToolsExt.so.1 -> libnvToolsExt.so.1.0.0
	libnppitc.so.8.0 -> libnppitc.so.8.0.61
	libnppc.so.8.0 -> libnppc.so.8.0.61
	libcusparse.so.8.0 -> libcusparse.so.8.0.61
/usr/local/cuda-9.1/targets/x86_64-linux/lib:
	libnppicc.so.9.1 -> libnppicc.so.9.1.85
	libnppisu.so.9.1 -> libnppisu.so.9.1.85
	libcufftw.so.9.1 -> libcufftw.so.9.1.85
	libcufft.so.9.1 -> libcufft.so.9.1.85
	libnppial.so.9.1 -> libnppial.so.9.1.85
	libnppist.so.9.1 -> libnppist.so.9.1.85
	libcublas.so.9.1 -> libcublas.so.9.1.85
	libnvblas.so.9.1 -> libnvblas.so.9.1.85
	libnppitc.so.9.1 -> libnppitc.so.9.1.85
	libcusolver.so.9.1 -> libcusolver.so.9.1.85
	libnvrtc.so.9.1 -> libnvrtc.so.9.1.85
	libnvrtc-builtins.so.9.1 -> libnvrtc-builtins.so.9.1.85
	libnppidei.so.9.1 -> libnppidei.so.9.1.85
	libOpenCL.so.1 -> libOpenCL.so.1.0.0
	libnppig.so.9.1 -> libnppig.so.9.1.85
	libnppc.so.9.1 -> libnppc.so.9.1.85
	libcudart.so.9.1 -> libcudart.so.9.1.85
	libnvToolsExt.so.1 -> libnvToolsExt.so.1.0.0
	libnvgraph.so.9.1 -> libnvgraph.so.9.1.85
	libnppif.so.9.1 -> libnppif.so.9.1.85
	libcusparse.so.9.1 -> libcusparse.so.9.1.85
	libaccinj64.so.9.1 -> libaccinj64.so.9.1.85
	libcuinj64.so.9.1 -> libcuinj64.so.9.1.85
	libnppim.so.9.1 -> libnppim.so.9.1.85
	libnppicom.so.9.1 -> libnppicom.so.9.1.85
	libnpps.so.9.1 -> libnpps.so.9.1.85
	libcurand.so.9.1 -> libcurand.so.9.1.85

Is that due to a name mismatch, i.e. libcublas.so.9.0 != libcublas.so.9.1? And if so, how can we overcome this?
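To check the question above mechanically, here is a minimal sketch that parses `ldconfig -v` output and reports which directories register an exact soname. The function names (`sonames_by_dir`, `dirs_providing`) are made up for illustration, and the parser assumes the layout shown in the listing above: a directory line, then tab-indented `name -> target` lines.

```python
import re

# Matches indented lines such as "\tlibcublas.so.9.1 -> libcublas.so.9.1.85".
_LINK_RE = re.compile(r"^\s+(\S+\.so[\d.]*)\s+->")


def sonames_by_dir(ldconfig_output):
    """Group the sonames printed by `ldconfig -v` under their directory."""
    found = {}
    current_dir = None
    for line in ldconfig_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line: a directory header ending in ":".
            current_dir = line.split(":", 1)[0]
            continue
        match = _LINK_RE.match(line)
        if match and current_dir is not None:
            found.setdefault(current_dir, []).append(match.group(1))
    return found


def dirs_providing(ldconfig_output, soname):
    """Return the directories in which the exact soname is registered."""
    return [d for d, names in sonames_by_dir(ldconfig_output).items()
            if soname in names]


# Two entries copied from the listing above.
SAMPLE = (
    "/usr/local/cuda-8.0/targets/x86_64-linux/lib:\n"
    "\tlibcublas.so.8.0 -> libcublas.so.8.0.88\n"
    "/usr/local/cuda-9.1/targets/x86_64-linux/lib:\n"
    "\tlibcublas.so.9.1 -> libcublas.so.9.1.85\n"
)

print(dirs_providing(SAMPLE, "libcublas.so.9.0"))  # []
print(dirs_providing(SAMPLE, "libcublas.so.9.1"))
```

The lookup is by exact soname, which is why 8.0 and 9.1 being present does not satisfy a request for 9.0.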

@Timonzimm

I think this is due to the fact that you have CUDA 9.1 and not 9.0, I am facing exactly the same issue.

@kirk86
Author

kirk86 commented Dec 23, 2017

@Timonzimm I know, and I think the whole issue is this f** versioned naming (libcublas.so.xxx) that NVIDIA uses. It inherently causes a mismatch on Linux systems whenever that number changes: the loader cannot find an exact match, so it concludes the file doesn't exist and throws the error.

@burui11087

I think you should create a symbolic link from "cuda/" to "cuda/9.1"; otherwise, your CUDA version is too new for the TensorFlow master branch.

@kirk86
Author

kirk86 commented Dec 27, 2017

@burui11087 I completely forgot about symlinking. Thanks for reminding me.

@asimshankar
Contributor

Seems like this is resolved (and the root cause was a version mismatch of CUDA - 9.0 vs 9.1)?
Closing this out since I understand it to be resolved, but please let me know if I'm mistaken.

FYI @gunan @av8ramit (who are working on the upcoming 1.5 release)

@yangfengKAUST

I have run into exactly the same problem as kirk86. In my case, I installed CUDA toolkit 8.0 and cuDNN 5.1.
Then I tried everything suggested above, and none of it works.

@gunan
Contributor

gunan commented Jan 3, 2018

For using nightlies, you have to have CUDA 9.0 and cudnn 7 installed.
@yangfengKAUST with the versions of CUDA and cuDNN you have installed, TF is just complaining that it cannot find the versions it is expecting.

@zhugejun

zhugejun commented Jan 9, 2018

@Timonzimm I am facing the same issue. Have you figured it out?

@gbolcer

gbolcer commented Jan 17, 2018

I have 8.0, 9.0, 9.1 installed + cudnn versions which seem specific to each. The sym linking didn't work from the 9.1 libs. I suspect that sometimes the symlink in the LD_LIBRARY_PATH doesn't work either when I switch versions on the /usr/local/cuda link. I ended up just doing it the low tech way to get the libraries loaded into my java program until I can figure out a cleaner way to handle the paths inside of Eclipse.

    try {
        System.load("/usr/local/cuda/lib64/libcublas.so.9.0");
        System.load("/usr/local/cuda/lib64/libcusolver.so.9.0");
        System.load("/usr/local/cuda/lib64/libcudart.so.9.0");
        System.load("/usr/local/cuda/lib64/libcufft.so.9.0");
        System.load("/usr/local/cuda/lib64/libcurand.so.9.0");

        System.load("/home/greg/Desktop/platform/tensorbuilder/jni/libtensorflow_jni.so");
    } catch (UnsatisfiedLinkError e) {
        System.err.println("Native code library failed to load.\n" + e);
        System.exit(1);
    }
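
For Python users, a comparable "low tech" approach might look like the sketch below, which loads each library by absolute path with `ctypes` before anything else needs it. This is only an illustration: the `preload` helper is hypothetical, the paths are copied from the Java snippet above and will differ per machine, and whether a later `import tensorflow` then succeeds depends on the loader resolving the already-loaded libraries by soname.

```python
import ctypes

# Paths copied from the Java example above (assumption: adjust to your install).
CUDA_LIBS = [
    "/usr/local/cuda/lib64/libcublas.so.9.0",
    "/usr/local/cuda/lib64/libcusolver.so.9.0",
    "/usr/local/cuda/lib64/libcudart.so.9.0",
    "/usr/local/cuda/lib64/libcufft.so.9.0",
    "/usr/local/cuda/lib64/libcurand.so.9.0",
]


def preload(paths, loader=None):
    """Try to load each library globally; return {path: error} for failures."""
    if loader is None:
        # RTLD_GLOBAL makes the symbols visible to libraries loaded later.
        loader = lambda p: ctypes.CDLL(p, mode=ctypes.RTLD_GLOBAL)
    failures = {}
    for path in paths:
        try:
            loader(path)
        except OSError as exc:
            failures[path] = str(exc)
    return failures


if __name__ == "__main__":
    for path, err in preload(CUDA_LIBS).items():
        print("failed:", path, "->", err)
```

The `loader` parameter exists only so the function can be exercised without CUDA installed.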


@AwasthiMaddy

@asimshankar Could you clarify your comment above: do you mean that we should downgrade CUDA to 9.0, and that TensorFlow 1.5 doesn't work with CUDA 9.1?

You have closed this issue, but it's not clear what the correct action is!

Note: I also have CUDA 9.1 installed instead of CUDA 9.0.

@gbolcer

gbolcer commented Jan 18, 2018

Just FYI, I have both installed. Building from scratch will work w/ either, but the nightly binaries use 9.0.

@asimshankar
Contributor

@AwasthiMaddy - Yes TensorFlow 1.5 release binaries are built for CUDA 9.

@dcdethan

Have you solved it? The problem is that tensorflow-gpu 1.5 requires CUDA 9.0, so you should install tensorflow-gpu 1.4 instead. Remember to uninstall tensorflow-gpu 1.5 first. Then use "pip install --upgrade tensorflow-gpu==1.4".

@nitinagarwal

@aipeteryao - Thank you.

@bwesen

bwesen commented Feb 2, 2018

Someone needs to fix the https://www.tensorflow.org/install/install_linux page if this is true. I just followed its instructions exactly, and it tells you to install CUDA 8.0 (specifically, not "latest CUDA").

Then as soon as you're done, you get this error (it is looking for cublas 9.0, which, from what I can read here, the latest CUDA would not have provided either, as CUDA 9.1 is the default you get from NVIDIA).

Either the webpage instructions should work with the default latest of everything, or they should tell you explicitly to install tensorflow-gpu-1.4 (for example) and not tensorflow-gpu.

@yazabaza

yazabaza commented Feb 2, 2018

Seconding bwesen's comment. I have CUDA 8.0 and TensorFlow 1.3. I followed the current install instructions for TF 1.5 (GPU, Ubuntu, virtualenv) and it breaks as described above. Reverting to TF 1.3 until this is resolved.

@AustinSong

@aipeteryao This fixed it, thanks! I ended up uninstalling the latest version and installing 1.4, in my virtualenv.

pip3 uninstall tensorflow-gpu
pip3 install --upgrade tensorflow-gpu==1.4

The install page for Ubuntu should be updated: https://www.tensorflow.org/install/install_linux
Since TensorFlow 1.5 is expecting Cuda 9.0 ( NOT 9.1 ), as well as cuDNN 7
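
The pairings mentioned in this thread can be written down as a small lookup, sketched below. The table holds only what commenters here reported (1.4 with CUDA 8.0 and cuDNN 6, 1.5 with CUDA 9.0 and cuDNN 7); it is not an official compatibility matrix, and `expected_cublas` is a hypothetical helper name.

```python
# CUDA / cuDNN pairs as reported by commenters in this thread (not official).
REPORTED_REQUIREMENTS = {
    "1.4": {"cuda": "8.0", "cudnn": "6"},
    "1.5": {"cuda": "9.0", "cudnn": "7"},
}


def expected_cublas(tf_version):
    """Return the libcublas soname a given release is reported to load, or None."""
    # Compare on major.minor only, so "1.5.0" and "1.5" map to the same entry.
    key = ".".join(tf_version.split(".")[:2])
    requirement = REPORTED_REQUIREMENTS.get(key)
    if requirement is None:
        return None
    return "libcublas.so." + requirement["cuda"]


print(expected_cublas("1.5.0"))  # libcublas.so.9.0
print(expected_cublas("1.4"))    # libcublas.so.8.0
```

Versions not discussed in the thread deliberately return None rather than a guess.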

@dcdethan

dcdethan commented Feb 3, 2018

In fact, we should check the official TensorFlow documentation, which lists TensorFlow's environment (including Python, GCC, CUDA, cuDNN, and so on).

@dcdethan

dcdethan commented Feb 5, 2018

@bwesen Yes, you were right. My computer has CUDA 8.0, cuDNN 6.0, and TensorFlow 1.4 installed.

@robhawkins

robhawkins commented Feb 8, 2018

I think this issue should still be open. @bwesen's comment is correct. The docs tell you to install Cuda 8.0 and use pip install --upgrade tensorflow-gpu. Right now that gives you tensorflow 1.5 which does not work with Cuda 8.0

pinging @asimshankar

@mkaze

mkaze commented Feb 12, 2018

I have the same issue (with CUDA 9.1 + TensorFlow 1.5). I think one option to resolve it is to downgrade CUDA to 9.0. The other option would be to downgrade both CUDA to 8.0 and TensorFlow to 1.4. If you have already installed CUDA 8.0, you only need to modify the LD_LIBRARY_PATH (and CUDA_HOME) environment variables to point to the CUDA 8.0 directory (i.e. /usr/local/cuda-8.0).
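
One way to try the environment-variable route without editing shell profiles is sketched below. The dynamic loader reads LD_LIBRARY_PATH when a process starts, so the sketch builds a modified environment for a child process rather than changing `os.environ` in the running interpreter. `cuda_env` and `run_with_cuda` are hypothetical helper names, and the `lib64` subdirectory is an assumption based on the paths quoted elsewhere in this thread.

```python
import os
import subprocess


def cuda_env(cuda_home, base_env=None):
    """Return a copy of the environment with cuda_home's lib64 searched first."""
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = os.path.join(cuda_home, "lib64")
    existing = env.get("LD_LIBRARY_PATH", "")
    # Prepend so this CUDA install wins over any other version on the path.
    env["LD_LIBRARY_PATH"] = lib_dir + (os.pathsep + existing if existing else "")
    env["CUDA_HOME"] = cuda_home
    return env


def run_with_cuda(cuda_home, argv):
    """Start a new process so the loader sees the updated search path."""
    return subprocess.run(argv, env=cuda_env(cuda_home)).returncode
```

For example, `run_with_cuda("/usr/local/cuda-8.0", [sys.executable, "-c", "import tensorflow"])` (after `import sys`) would return the child's exit code, which gives a quick yes/no on whether the import works under that CUDA directory.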

@DylanDmitri
Contributor

I'm getting this issue (Cuda 9.1.85, cuDNN 7.05)

Tried with tensorflow 1.5, it broke. Uninstalled, installed 1.4 with pip3 install --upgrade tensorflow-gpu==1.4, still broke.

@AustinSong

@DylanDmitri 1.5 expects Cuda 9.0, not 9.1

Have you tried with Cuda 9.0 drivers?

@robhawkins

@DylanDmitri @mkaze You need Cuda 9.0.

Also, for anyone having trouble installing requirements, I suggest double checking your cuDNN installation. The .deb file didn't work for me because it did not copy files to the right place. I had to use the .tgz file and manually copy files according to nVidia's directions in order to get a working installation.

@abrahamrhoffman

abrahamrhoffman commented Feb 25, 2018

Why not just install cuda-9-0?

sudo dpkg -i cuda-repo-ubuntu1604_9.0.176-1_amd64.deb
sudo apt-key adv --fetch-keys \
     http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-9-0

@ghost

ghost commented Dec 26, 2018

I have installed CUDA 10 and the latest version of TensorFlow, but I get ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory. I downgraded and I am still receiving this message...

@stefanofiorentino

I have installed CUDA 10 and the latest version of TensorFlow, but I get ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory. I downgraded and I am still receiving this message...

Could you tell us the OS version and the exact command you issued to "downgrade"?

@ghost

ghost commented Dec 27, 2018

I tried Deepin 15.8 and Mint 19.1, and I'm going to test Windows 10 later. The command was pip install --upgrade tensorflow-gpu==version, and I tried the master versions: 1.9.0, 1.10.1, 1.11.0 and 1.12.0. The CUDA version and the drivers:

image

@stefanofiorentino

stefanofiorentino commented Dec 27, 2018

@sphinxs, in my opinion you have two ways: a) to recompile the python wheel locally so it points to the installed cuda version or b) install cuda-9.0 as I did for my 18.04 installation and it's still working after months.

@pascalwhoop

Even better solution - remove tensorflow, install pytorch
Why can tensorflow break so easily?
Why can't I just install it and run?
Why is it so easy with pytorch and unpredictable with tf? Wtf is wrong with it?

the man has got a point. TF CAN be compiled to work with more modern CUDA versions... why not just offer that to most people with a tensorflow-gpu-cuda110 package or something temporary until the library can figure out the underlying cuda version dynamically?

@ghost

ghost commented Mar 30, 2019

Even better solution - remove tensorflow, install pytorch
Why can tensorflow break so easily?
Why can't I just install it and run?
Why is it so easy with pytorch and unpredictable with tf? Wtf is wrong with it?

I am sick of it!!!!

@JerryAtHub

I am sick of it!!!!

This comment does not add anything to the issue. Also, no one forces anyone to use TF, but if you really need to and can't get the pip packages working, why not build it yourself? That's what I ended up doing at some point. I saw there are also community-supported builds listed at
https://github.com/tensorflow/tensorflow (just scroll down).
Maybe build for your GPU and contribute?

@surak

surak commented Mar 30, 2019

Even better solution - remove tensorflow, install pytorch
Why can tensorflow break so easily?
Why can't I just install it and run?
Why is it so easy with pytorch and unpredictable with tf? Wtf is wrong with it?

I am sick of it!!!!

This is an issue page. Please, state your issues so they can be fixed, or, even better, help fixing them.

@jackalcooper

@raphaunix may I ask where this chart comes from?

@jabalazs

jabalazs commented May 2, 2019

I got Tensorflow 1.11.0 working by running the following commands:

conda create -n tf python=2
conda activate tf
pip install tensorflow-gpu==1.11
conda install cudatoolkit==9.0

@NYcleaner

I found that the cause is ldconfig. ldconfig is the command that manages the dynamic linker's cache, which is what lets the system find shared libraries.

By default, ldconfig only searches /lib and /usr/lib, plus the directories listed in the configuration file /etc/ld.so.conf.

So the error occurs because the CUDA shared libraries live under the CUDA install path, such as /path/cuda-9.0/lib64 or /path/cuda-9.0/lib, which is not searched by default. (For example, my CUDA is installed in /usr/local/cuda-9.0.)

1. If you installed CUDA manually, add the cuda/lib64 path to /etc/ld.so.conf after the install. The append has to run as root, so pipe through tee instead of using "sudo echo ... >>", where the redirection would run as your own user:
echo "/usr/local/cuda-9.0/lib64" | sudo tee -a /etc/ld.so.conf
then
sudo ldconfig
Of course, you can also add the path by hand:
sudo vim /etc/ld.so.conf
then add the path '/usr/local/cuda-9.0/lib64' at the end,
then update the cache:
sudo ldconfig
After that, reopen ipython or PyCharm and run
import tensorflow as tf
Wow, you will enjoy it!

2. If you installed CUDA with a command such as 'dpkg -i cuda-repo-ubuntu1604_9.0.176-1_amd64.deb', it may add the CUDA lib path to /etc/ld.so.conf automatically. To be on the safe side, check /etc/ld.so.conf and see whether the path was added.
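
The "check /etc/ld.so.conf" step can be sketched as a small parser. This is a simplified illustration under stated assumptions: it expects one directory per line, ignores `#` comments, and does not follow `include` lines (so entries under /etc/ld.so.conf.d/ are not seen). The helper names `listed_dirs` and `is_listed` are hypothetical.

```python
import os


def listed_dirs(conf_text):
    """Return the directories named directly in ld.so.conf-style text."""
    dirs = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.split()[0] == "include":
            continue  # points at other config files; not followed in this sketch
        dirs.append(os.path.normpath(line))
    return dirs


def is_listed(conf_text, directory):
    """True if directory appears in the text, ignoring a trailing slash."""
    return os.path.normpath(directory) in listed_dirs(conf_text)


EXAMPLE = "include /etc/ld.so.conf.d/*.conf\n/usr/local/cuda-9.0/lib64/\n"
print(is_listed(EXAMPLE, "/usr/local/cuda-9.0/lib64"))  # True
print(is_listed(EXAMPLE, "/usr/local/cuda-9.0"))        # False
```

On a real machine the text would come from reading /etc/ld.so.conf; remember that `sudo ldconfig` still has to be run after any change.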

@Nestak2

Nestak2 commented Mar 26, 2020

@jabalazs Instead of conda install cudatoolkit==9.0 I tried pip install cudatoolkit==9.0 but this didn't work, do you know why?

@boscoj2008

I found that my runtime on colab was not using GPU that's how come I got my error

@IssamLaradji

Even better solution - remove tensorflow, install pytorch
Why can tensorflow break so easily?
Why can't I just install it and run?
Why is it so easy with pytorch and unpredictable with tf? Wtf is wrong with it?

Good solution @dodler ! but I think you are missing the commands to make this great advice easy to follow through xD.

pip uninstall tensorflow
pip install torch

Happy coding!

@BrunoArsioli

I faced this same error trying to use ThunderSVM to speed up NuSVR with GPUs on Google Colab.
Installing CUDA 9.0 solved the problem for me.
Just follow:

!wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda-repo-ubuntu1704-9-0-local_9.0.176-1_amd64-deb

!ls # Check if required cuda 9.0 amd64-deb file is downloaded

!dpkg -i cuda-repo-ubuntu1704-9-0-local_9.0.176-1_amd64-deb

!ls /var/cuda-repo-9-0-local | grep .pub

!apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub

!apt-get update

!sudo apt-get install cuda-9.0

@dxxd116

dxxd116 commented Feb 9, 2021

conda install cudatoolkit=8.0

solves my problem.

@thammegowda

conda install -c anaconda cudatoolkit=9.0 cudnn=7 solved it for me!

@lisa676

lisa676 commented Mar 30, 2021

With CUDA 8 and 9.0 installed, setting LD_LIBRARY_PATH in .bashrc and .profile did not work. So I set
LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64 in the PyCharm Environment variables field, and it works.

This solution worked for me. Thanks for this solution.

@meghbhalerao

My TensorFlow and TensorFlow-gpu are 1.12.0 and CUDA 11.0 and I am facing this issue -

Traceback (most recent call last):
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bne.py", line 21, in <module>
    from encoder import *
  File "/home/megh/projects/entity-norm/BNE/encoder.py", line 2, in <module>
    import tensorflow as tf
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/megh/anaconda3/envs/bne/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

Someone please help. I have been struggling for days! :'(

@piktiv-ilkin

@meghbhalerao I guess the cause of your problem is a TensorFlow and CUDA version mismatch. Check compatible versions here. Have you tried installing different CUDA or TensorFlow versions?
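
When diagnosing a mismatch like this, the version the binary wants can be read straight out of the error text. The sketch below pulls the library name and version from a message in the format shown in the traceback above; `missing_library` is a hypothetical helper, and the regular expression assumes that exact wording.

```python
import re

# Matches e.g. "libcublas.so.9.0: cannot open shared object file".
_MISSING_RE = re.compile(
    r"(lib\w+)\.so\.(\d+(?:\.\d+)*): cannot open shared object file"
)


def missing_library(error_text):
    """Return (library, version) for the first missing shared object, or None."""
    match = _MISSING_RE.search(error_text)
    return (match.group(1), match.group(2)) if match else None


MESSAGE = (
    "ImportError: libcublas.so.9.0: cannot open shared object file: "
    "No such file or directory"
)
print(missing_library(MESSAGE))  # ('libcublas', '9.0')
```

Here the result says the installed TensorFlow build was linked against the CUDA 9.0 cuBLAS, regardless of which CUDA version is installed on the machine.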

@super-ask

Trying to install thundersvm on Colab instance, hit this error:

pip install thundersvm

Collecting thundersvm
  Downloading thundersvm-0.3.12-py3-none-any.whl (507 kB)
     |████████████████████████████████| 507 kB 7.5 MB/s 
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from thundersvm) (1.19.5)
Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from thundersvm) (1.4.1)
Requirement already satisfied: scikit-learn in /usr/local/lib/python3.7/dist-packages (from thundersvm) (0.22.2.post1)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-learn->thundersvm) (1.1.0)
Installing collected packages: thundersvm
Successfully installed thundersvm-0.3.12

#python wheel available in working directory
ls
sample_data/  thundersvm-cpu-0.2.0-py3-none-linux_x86_64.whl

pip install thundersvm-cu90-0.2.0-py3-none-linux_x86_64.whl
WARNING: Requirement 'thundersvm-cu90-0.2.0-py3-none-linux_x86_64.whl' looks like a filename, but the file does not exist
Processing ./thundersvm-cu90-0.2.0-py3-none-linux_x86_64.whl
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/content/thundersvm-cu90-0.2.0-py3-none-linux_x86_64.whl'

#full filepath
pwd
'/content'
pip install content/thundersvm-cu90-0.2.0-py3-none-linux_x86_64.whl
WARNING: Requirement 'content/thundersvm-cu90-0.2.0-py3-none-linux_x86_64.whl' looks like a filename, but the file does not exist
Processing ./content/thundersvm-cu90-0.2.0-py3-none-linux_x86_64.whl
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/content/content/thundersvm-cu90-0.2.0-py3-none-linux_x86_64.whl'
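
Note that in the log above, `ls` lists `thundersvm-cpu-0.2.0-py3-none-linux_x86_64.whl`, while the install commands name a `thundersvm-cu90-...` file, and the second attempt uses a relative `content/...` path from inside `/content`. A minimal sketch for checking which wheel files actually exist before calling pip follows; `find_wheels` is a hypothetical helper, not part of pip.

```python
import glob
import os


def find_wheels(directory, prefix):
    """Return absolute paths of .whl files in directory whose name starts with prefix."""
    base = glob.escape(os.path.abspath(directory))
    pattern = os.path.join(base, glob.escape(prefix) + "*.whl")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Lists every thundersvm wheel in the current working directory.
    for path in find_wheels(".", "thundersvm-"):
        print(path)
```

Passing one of the printed absolute paths to `pip install` avoids both the filename and the relative-path problem.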
