Support various versions of cuda #162

langley · 2019-05-01T00:54:45Z

I'm exploring using your package on Google Colaboratory by way of almond.sh.
The non-GPU version works fine. Thanks, this is awesome!

But when I try to use the linux-gpu-x86_64 version, I get the following error:

java.lang.UnsatisfiedLinkError: /tmp/tensorflow_scala_native_libraries13086553369265287406/libtensorflow_jni.so: libcublas.so.9.0: cannot open shared object file: No such file or directory

I believe this is similar to TensorFlow issue #15604.

When I checked what was installed under /usr/local, I saw:
...
lrwxrwxrwx 1 root root 9 Apr 4 20:13 cuda -> cuda-10.0
drwxr-xr-x 1 root root 4096 Apr 4 20:11 cuda-10.0
...
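
The mismatch above (the loader asks for libcublas.so.9.0 while /usr/local/cuda points at cuda-10.0) can also be read out of the exception message programmatically. The following is only a sketch: the class name MissingLib and the regular expression are made up for illustration, and it assumes the message has the same "<name>: cannot open shared object file" shape as the error quoted above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of TF4S or TensorFlow.
final class MissingLib {
    // Captures the file name that directly precedes
    // ": cannot open shared object file", e.g. "libcublas.so.9.0".
    private static final Pattern MISSING =
        Pattern.compile("([^\\s:/]+\\.so[.\\d]*): cannot open shared object file");

    // Returns the name of the shared object the loader could not find, if any.
    static Optional<String> missingLibrary(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher m = MISSING.matcher(message);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Message copied from the error reported in this issue.
        String msg = "/tmp/tensorflow_scala_native_libraries13086553369265287406/libtensorflow_jni.so: "
            + "libcublas.so.9.0: cannot open shared object file: No such file or directory";
        System.out.println(missingLibrary(msg).orElse("(none)")); // prints libcublas.so.9.0
    }
}
```

The version suffix of the reported name (9.0 here) is the CUDA version the prebuilt binary was linked against, which can then be compared with the directory that /usr/local/cuda links to.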

My guess is that the "fix" is to compile TensorFlow from source against that version of the CUDA libraries. So I have a couple of questions before I embark on that:

Would this "compile from source" approach work?
Would the transition to TF 2.0 cause problems? I don't know the details of the TensorFlow APIs and tensorflow_scala well enough yet to know whether the changes for TF 2.0 will cause significant problems for tensorflow_scala.

DirkToewe · 2019-06-22T09:49:51Z

Hi @langley

Yeah, recompiling from source should work. As far as I know, TF4S only depends on the C API (libtensorflow.so), so you should be able to compile TensorFlow with a different version of CUDA, or even with SYCL or ROCm.
The C API in TF 2.0 is very likely backwards compatible, so TF4S should work with it as well. The TF4S API, however, is going to remain the same. Gradients in eager mode, for example, are not yet supported.

Sadly, compiling TF from source with CUDA involves quite a bit of trial and error. Here are some things I learned the hard way:

Use -D_GLIBCXX_USE_CXX11_ABI=0, otherwise libtensorflow.so won't work with TF4S.
To improve CPU performance, you can use -O3 -march=native as compiler flags. If you are compiling on a different CPU than the one you run the code on, you can instead enable individual vector extensions one by one, like -msse -msse2 -msse3 -mssse3 -msse4 -msse4.1 -msse4.2 -mavx -mavx2, depending on which extensions the target CPU supports.
I believe CUDA 10 does not support gcc-8, so make sure you use gcc-7.
To load additional shared objects in Java, you can use System.load.
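
The last tip can be sketched in Java (callable from Scala as-is). This is a minimal sketch under assumptions: the class name NativeDeps, the helper loadInOrder, and the example paths in the comment are hypothetical and not part of TF4S. System.load expects an absolute path and throws UnsatisfiedLinkError if the file cannot be loaded. As far as I understand, preloading only helps when the file you load is the one the JNI library actually asks for. Loading a CUDA 10 libcublas will not satisfy a request for libcublas.so.9.0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of TF4S.
final class NativeDeps {
    // Loads each shared object by absolute path, in the given order.
    // Dependencies should come before the libraries that need them.
    // Returns the paths that were loaded; throws UnsatisfiedLinkError on failure.
    static List<String> loadInOrder(String... absolutePaths) {
        List<String> loaded = new ArrayList<>();
        for (String path : absolutePaths) {
            System.load(path); // requires an absolute path
            loaded.add(path);
        }
        return loaded;
    }

    public static void main(String[] args) {
        // Example invocation (paths are assumptions; adjust to your install):
        //   java NativeDeps /usr/local/cuda/lib64/libcudart.so.10.0 \
        //                   /usr/local/cuda/lib64/libcublas.so.10.0
        try {
            List<String> loaded = loadInOrder(args);
            System.out.println("Loaded " + loaded.size() + " native libraries");
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Could not load: " + e.getMessage());
        }
    }
}
```

The call would have to happen before the first use of any TF4S class that triggers loading of libtensorflow_jni.so, for example at the very top of the application's main method.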

jxtps · 2019-07-27T04:27:41Z

Would it make sense for TF4S to use JavaCPP Presets for TensorFlow?

They have convenient packaging of all the binaries for several platforms (Linux, Mac, Windows), and GPU support is just a Maven include away.

I realize that the TensorFlow C++ API is different from the C API, but they have some instructions for how to create new presets.

nazarblch · 2019-08-02T18:33:48Z

JavaCPP requires a lot of manual work. In most cases it is difficult to generate suitable Java interfaces. I have used JavaCPP for the PyTorch API; you can find it here: https://github.com/nazarblch/torch-scala
