Linker problem with cuBLAS build in Linux #74

Open
delorytheape opened this issue May 15, 2019 · 4 comments

@delorytheape

delorytheape commented May 15, 2019

I ran into problems building Gpufit with the cuBLAS library on a fresh install of Linux (Ubuntu 18.04). Details:
OS: Ubuntu 18.04
GCC: 7.4.0
cmake: 3.14.3
Boost: 1.65.1
CUDA: 10.1.105-1
Nvidia Driver: 418.39-1
GPU: GTX1080ti
Gpufit: 12496a... Apr 12 16:11:54 2019

I used the following commands to build:

mkdir Gpufit-build
cd Gpufit-build
cmake -DCMAKE_BUILD_TYPE=RELEASE -DUSE_CUBLAS=TRUE ../Gpufit
make 

The build failed with the following linker errors:

../libGpufit.so: undefined reference to 'init_gemm_select'
../libGpufit.so: undefined reference to 'free_gemm_select'
../libGpufit.so: undefined reference to 'cublasLtGetVersion'
../libGpufit.so: undefined reference to 'cublasLtGetProperty'
../libGpufit.so: undefined reference to 'cublasLtCtxInit'
../libGpufit.so: undefined reference to 'cublasLtGetCudartVersion'
../libGpufit.so: undefined reference to 'cublasLtShutdownCtx'

I was eventually able to get around this problem (and a subsequent one) as follows.
(Note: I do not claim that this is the best way to solve the problem. It is merely what worked for me, and it may help in addressing the actual problem.)

  1. After installing the CUDA toolkit (from the .deb package), I had to create symbolic links in /usr/local/cuda/lib64 that point to the cuBLAS libraries (for some reason, with CUDA 10.1 Nvidia has put these in a different location, /usr/lib/x86_64-linux-gnu). Note that ln -s takes the existing file first and the link name second. The commands I used were:
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublasLt.so /usr/local/cuda/lib64/libcublasLt.so
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublasLt.so.10 /usr/local/cuda/lib64/libcublasLt.so.10
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublasLt.so.10.1.0.105 /usr/local/cuda/lib64/libcublasLt.so.10.1.0.105
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublasLt_static.a /usr/local/cuda/lib64/libcublasLt_static.a
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublas.so /usr/local/cuda/lib64/libcublas.so
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10 /usr/local/cuda/lib64/libcublas.so.10
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10.1.0.105 /usr/local/cuda/lib64/libcublas.so.10.1.0.105
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublas_static.a /usr/local/cuda/lib64/libcublas_static.a
  2. To fix the build errors, it was necessary to tell the linker to include the static library libcublasLt_static.a. This was achieved by modifying the files
    <Gpufit repository path>/Gpufit/CMakeLists.txt
    <Gpufit repository path>/Gpufit/examples/CMakeLists.txt
    <Gpufit repository path>/Gpufit/Gpufit/examples/CMakeLists.txt
    In each file, every instance of the line
target_link_libraries(${target} ${modules})

was replaced with the following line:

target_link_libraries(${target} ${modules} /usr/local/cuda/lib64/libcublasLt_static.a)

The build directory was then erased

cd <Gpufit repository path>/Gpufit-build
rm -rf *

and the cmake command rerun

cmake -DCMAKE_BUILD_TYPE=RELEASE -DUSE_CUBLAS=TRUE ../Gpufit
make

The build then succeeded.
As previously mentioned, I am not very familiar with the cmake machinery, so there may be a much better way to fix this problem than this workaround. Hope it helps anyway.
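
The eight ln commands in step 1 can also be written as a loop. This is only a sketch, not something taken from the Gpufit repository: the function name link_cublas is made up for illustration, and it assumes the libraries live in /usr/lib/x86_64-linux-gnu and should be linked into /usr/local/cuda/lib64, as on the system described above.

```shell
#!/bin/sh
# link_cublas SRC_DIR DST_DIR   (hypothetical helper, not part of Gpufit or CUDA)
# Create a symlink in DST_DIR for every libcublas* file found in SRC_DIR.
# The pattern libcublas* also matches the libcublasLt* files.
link_cublas() {
    src_dir=$1
    dst_dir=$2
    for lib in "$src_dir"/libcublas*; do
        # The glob stays literal when nothing matches, so skip non-existent paths.
        [ -e "$lib" ] || continue
        link="$dst_dir/$(basename "$lib")"
        # Leave anything that already exists in DST_DIR untouched.
        if [ -e "$link" ] || [ -L "$link" ]; then
            continue
        fi
        # ln -s takes the existing file first, then the name of the link.
        ln -s "$lib" "$link"
    done
}
```

On a system laid out like mine it would be called, with root privileges, as link_cublas /usr/lib/x86_64-linux-gnu /usr/local/cuda/lib64.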

@ironictoo

I had the same problem and solved it with the following changes to a single CMake file. The <repo path>/Gpufit/CMakeLists.txt file was edited as follows:

    else()
        set( CUDA_CUBLAS_LIBRARIES 
            /usr/lib/x86_64-linux-gnu/libcublas_static.a 
            /usr/lib/x86_64-linux-gnu/libcublasLt_static.a
            ${CUDA_TOOLKIT_ROOT_DIR}/lib64/libcudart_static.a
            dl
            pthread
            rt
            #${CUDA_TOOLKIT_ROOT_DIR}/lib64/libcublas_static.a
            ${CUDA_TOOLKIT_ROOT_DIR}/lib64/libculibos.a )
    endif()

I had two problems: the cuBLAS libraries weren't in CUDA_TOOLKIT_ROOT_DIR, and additional libraries were required. The above edit isn't the most robust, but it worked on my platform. While looking into how to make this more robust, I found that the FindCUDA module has been deprecated, so someone should really rewrite the <repo path>/Gpufit/CMakeLists.txt file to use the native CUDA support in CMake >3.10.
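
To check for the first problem before editing any CMake file, a small shell helper can report which directories actually contain libcublasLt. This is a rough sketch: the function name find_cublaslt_dirs is hypothetical, and the two default directories are simply the ones mentioned in this thread, so other installs may need different paths.

```shell
#!/bin/sh
# find_cublaslt_dirs [DIR...]   (hypothetical helper for illustration)
# Print each directory from the argument list that contains a libcublasLt file.
# With no arguments, check the two locations mentioned in this thread.
find_cublaslt_dirs() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/local/cuda/lib64 /usr/lib/x86_64-linux-gnu
    fi
    for dir in "$@"; do
        for lib in "$dir"/libcublasLt*; do
            # The glob stays literal when nothing matches, so test for existence.
            if [ -e "$lib" ]; then
                printf '%s\n' "$dir"
                break   # one match is enough, move on to the next directory
            fi
        done
    done
}
```

Running find_cublaslt_dirs with no arguments prints nothing if neither directory has the library; if only /usr/lib/x86_64-linux-gnu is printed, the install matches the situation described here.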

@ironictoo

After more research, it seems the native CMake CUDA support doesn't have as many features as the deprecated FindCUDA module, which is probably why it isn't being used. I updated the above to the slightly more robust:

            find_cuda_helper_libs(cublas_static)
            find_cuda_helper_libs(cublasLt_static)
            find_cuda_helper_libs(culibos)

            set( CUDA_CUBLAS_LIBRARIES 
                ${CUDA_cublas_static_LIBRARY}
                ${CUDA_cublasLt_static_LIBRARY}
                ${CUDA_cudart_static_LIBRARY}
                ${CUDA_culibos_LIBRARY}
                dl
                pthread
                rt )

@jkfindeisen
Collaborator

I checked it and it may depend on the CUDA version. It may require some additional CMake code and testing. I'll come back to it later.

@jkfindeisen
Collaborator

In ed273ca I improved the situation a bit, but I'll leave this issue open until PR #94 is decided.
