Support various versions of cuda #162

Open
langley opened this issue May 1, 2019 · 3 comments


langley commented May 1, 2019

I'm exploring using your package on Google's Colaboratory by leveraging almond.sh.
The non-GPU version works fine. THANKS, this is awesome!

But when I try and use the linux-gpu-x86_64 version, I get the following error:

java.lang.UnsatisfiedLinkError: /tmp/tensorflow_scala_native_libraries13086553369265287406/libtensorflow_jni.so: libcublas.so.9.0: cannot open shared object file: No such file or directory

I believe this is similar to TensorFlow issue #15604.

When I checked what was installed under /usr/local I see
...
lrwxrwxrwx 1 root root 9 Apr 4 20:13 cuda -> cuda-10.0
drwxr-xr-x 1 root root 4096 Apr 4 20:11 cuda-10.0
...
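
For anyone hitting the same mismatch, here is a minimal Java sketch of the diagnosis above. It pulls the missing library name out of the UnsatisfiedLinkError message and resolves the /usr/local/cuda symlink to see which toolkit is installed. The class and method names (CudaProbe, missingLibrary, resolveCudaDir) are made up for illustration and are not part of tensorflow_scala; the regular expression assumes the loader message has the same shape as the one quoted above.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of tensorflow_scala.
final class CudaProbe {
    // Matches "<name>.so[.N[.M...]]: cannot open shared object file" in the loader message.
    private static final Pattern MISSING =
        Pattern.compile("(\\S+\\.so(?:\\.\\d+)*): cannot open shared object file");

    /** Returns the library the loader could not open, or null if the message has another shape. */
    static String missingLibrary(String errorMessage) {
        Matcher m = MISSING.matcher(errorMessage);
        return m.find() ? m.group(1) : null;
    }

    /** Follows symlinks (e.g. cuda -> cuda-10.0); returns null if the path does not exist. */
    static Path resolveCudaDir(Path link) {
        try {
            return link.toRealPath();
        } catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String msg = "/tmp/tensorflow_scala_native_libraries13086553369265287406/"
            + "libtensorflow_jni.so: libcublas.so.9.0: cannot open shared object file: "
            + "No such file or directory";
        System.out.println("missing library: " + missingLibrary(msg));
        System.out.println("installed CUDA:  " + resolveCudaDir(Paths.get("/usr/local/cuda")));
    }
}
```

If the version suffix of the missing library (9.0 here) does not match the resolved directory (cuda-10.0 here), the native library was built against a different CUDA release than the one installed.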

My guess is that the "fix" for this is to compile TensorFlow from source against that version of the CUDA libraries. So I have a couple of questions before I embark on that.

  1. would this "compile from source" work?
  2. would the transition to "TF 2.0" cause problems? I don't know the details of the TensorFlow APIs and tensorflow_scala well enough yet to know if the changes for TF 2.0 will cause significant problems for tensorflow_scala.

@DirkToewe
Contributor

Hi @langley

  1. Yeah, recompiling from source should work. As far as I know, TF4S only depends on the C API libtensorflow.so, so you should be able to compile TensorFlow with different versions of CUDA, or even SYCL or ROCm.
  2. The C API in TF 2.0 is very likely backwards compatible, so TF4S should work with it as well. The TF4S API, however, is going to remain the same. Gradients in eager mode, for example, are not yet supported.

Sadly, compiling TF from source with CUDA is quite a bit of trial and error. Here are some things I learned the hard way:

  • Use -D_GLIBCXX_USE_CXX11_ABI=0, otherwise libtensorflow.so won't work with TF4S.
  • To improve CPU performance, you can use -O3 -march=native as compiler flags. If you are compiling on a different CPU than the one you run the code on, you can enable individual vector extensions
    one by one, like -msse -msse2 -msse3 -mssse3 -msse4 -msse4.1 -msse4.2 -mavx -mavx2, depending
    on which extensions are supported by the target CPU.
  • I believe CUDA 10 does not support gcc-8, so make sure you use gcc-7.
  • To load additional shared objects in Java, you can use System.load.
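
A rough sketch of the System.load idea from the last bullet: load the dependent shared objects by absolute path before the TensorFlow native library is loaded. The class and method names here (NativePreloader, preload) are made up for illustration, and the paths are whatever you pass on the command line. On Linux the loader typically reuses an already loaded library with the same soname, so this can help when the right library version is installed outside the default search path. It will not help if only a different version is installed (e.g. libcublas.so.10.0 when libcublas.so.9.0 is required).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of tensorflow_scala.
final class NativePreloader {
    /**
     * Calls System.load on every path that exists, in the given order,
     * and returns the paths that were actually loaded.
     */
    static List<String> preload(List<String> absolutePaths) {
        List<String> loaded = new ArrayList<>();
        for (String p : absolutePaths) {
            File f = new File(p);
            if (!f.isFile()) {
                continue; // skip libraries that are not present on this machine
            }
            // Throws UnsatisfiedLinkError if the file cannot be loaded as a native library.
            System.load(f.getAbsolutePath());
            loaded.add(p);
        }
        return loaded;
    }

    public static void main(String[] args) {
        // Pass the absolute paths of the dependencies, dependencies first.
        System.out.println("preloaded: " + preload(Arrays.asList(args)));
    }
}
```

Order matters: a library should come after the libraries it depends on, and the whole call has to happen before the first TF4S class that triggers loading of libtensorflow_jni.so.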

@jxtps

jxtps commented Jul 27, 2019

Would it make sense for TF4S to use JavaCPP Presets for TensorFlow?

They have convenient packaging of all the binaries for several platforms (linux, mac, windows), and GPU support is just a maven include away.

I realize that the C++ Tensorflow is different from the C API, but they have some instructions for how to create new presets.

@nazarblch

JavaCPP requires a lot of manual work. In most cases it is difficult to generate suitable Java interfaces. I have used JavaCPP for the PyTorch API; you can find it here: https://github.com/nazarblch/torch-scala
