[Bug]: Error when compiling "ModuleNotFoundError: No module named 'Tensile.TensileCreateLibrary'" #1296

Open
jbaileyhandle opened this issue Mar 4, 2023 · 10 comments
@jbaileyhandle

Describe the bug

When I try to compile from source, I get the error "ModuleNotFoundError: No module named 'Tensile.TensileCreateLibrary'"

To Reproduce

rocBLAS commit hash: 4a92c6f on branch release/rocm-rel-5.2
I run ./install.sh (I have previously run ./install.sh -d to install dependencies)

Expected behavior

rocBLAS compiles

Log-files

Output from the install.sh script:
`+ [[ true == true ]]

  • rm -rf /home/jbaile/gpu2/apps/rocBLAS/build/release
  • cmake_executable=cmake
  • [[ true == true ]]
  • cxx=hipcc
  • cc=hipcc
  • fc=gfortran
  • [[ true == true ]]
  • export PATH=/opt/rocm/bin:/opt/rocm/hip/bin:/opt/rocm/llvm/bin:/home/jbaile/miniconda3/bin:/home/jbaile/miniconda3/condabin:/home/jbaile/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/opt/rocm-5.2.3/bin:/opt/rocm-5.2.3/opencl/bin:/home/jbaile/amd-gpu-tools/ominprof/install/1.0.6/bin:/opt/rocm-5.2.3/bin:/opt/rocm-5.2.3/opencl/bin:/home/jbaile/amd-gpu-tools/ominprof/install/1.0.6/bin
  • PATH=/opt/rocm/bin:/opt/rocm/hip/bin:/opt/rocm/llvm/bin:/home/jbaile/miniconda3/bin:/home/jbaile/miniconda3/condabin:/home/jbaile/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/opt/rocm-5.2.3/bin:/opt/rocm-5.2.3/opencl/bin:/home/jbaile/amd-gpu-tools/ominprof/install/1.0.6/bin:/opt/rocm-5.2.3/bin:/opt/rocm-5.2.3/opencl/bin:/home/jbaile/amd-gpu-tools/ominprof/install/1.0.6/bin
  • [[ false == true ]]
  • [[ false == true ]]
  • pushd .
    ~/gpu2/apps/rocBLAS ~/gpu2/apps/rocBLAS
  • cmake_common_options=
  • cmake_client_options=
  • cmake_common_options=' -DROCM_PATH=/opt/rocm -DAMDGPU_TARGETS=all'
  • [[ true == true ]]
  • mkdir -p /home/jbaile/gpu2/apps/rocBLAS/build/release/clients
  • cd /home/jbaile/gpu2/apps/rocBLAS/build/release
  • cmake_common_options=' -DROCM_PATH=/opt/rocm -DAMDGPU_TARGETS=all -DCMAKE_BUILD_TYPE=Release'
  • [[ false == true ]]
  • [[ false == true ]]
  • [[ -n '' ]]
  • [[ -n '' ]]
  • [[ -n '' ]]
  • [[ false == true ]]
  • [[ -n '' ]]
  • tensile_opt=
  • [[ true == false ]]
  • tensile_opt=' -DTensile_LOGIC=asm_full -DTensile_CODE_OBJECT_VERSION=V3'
    ++ nproc
  • [[ 8 != 8 ]]
  • [[ '' == false ]]
  • [[ true == true ]]
  • tensile_opt=' -DTensile_LOGIC=asm_full -DTensile_CODE_OBJECT_VERSION=V3 -DTensile_SEPARATE_ARCHITECTURES=ON'
  • [[ true == true ]]
  • tensile_opt=' -DTensile_LOGIC=asm_full -DTensile_CODE_OBJECT_VERSION=V3 -DTensile_SEPARATE_ARCHITECTURES=ON -DTensile_LIBRARY_FORMAT=msgpack'
  • cmake_common_options='-DCMAKE_TOOLCHAIN_FILE=toolchain-linux.cmake -DROCM_PATH=/opt/rocm -DAMDGPU_TARGETS=all -DCMAKE_BUILD_TYPE=Release -DTensile_LOGIC=asm_full -DTensile_CODE_OBJECT_VERSION=V3 -DTensile_SEPARATE_ARCHITECTURES=ON -DTensile_LIBRARY_FORMAT=msgpack'
  • [[ false == true ]]
  • [[ true == false ]]
  • [[ true == true ]]
  • cmake_common_options='-DCMAKE_TOOLCHAIN_FILE=toolchain-linux.cmake -DROCM_PATH=/opt/rocm -DAMDGPU_TARGETS=all -DCMAKE_BUILD_TYPE=Release -DTensile_LOGIC=asm_full -DTensile_CODE_OBJECT_VERSION=V3 -DTensile_SEPARATE_ARCHITECTURES=ON -DTensile_LIBRARY_FORMAT=msgpack -DRUN_HEADER_TESTING=OFF'
  • [[ false == true ]]
  • [[ false == true ]]
  • [[ true == true ]]
  • cmake_common_options='-DCMAKE_TOOLCHAIN_FILE=toolchain-linux.cmake -DROCM_PATH=/opt/rocm -DAMDGPU_TARGETS=all -DCMAKE_BUILD_TYPE=Release -DTensile_LOGIC=asm_full -DTensile_CODE_OBJECT_VERSION=V3 -DTensile_SEPARATE_ARCHITECTURES=ON -DTensile_LIBRARY_FORMAT=msgpack -DRUN_HEADER_TESTING=OFF -DBUILD_FILE_REORG_BACKWARD_COMPATIBILITY=ON'
  • [[ false == true ]]
  • CXX=hipcc
  • CC=hipcc
  • cmake -DCMAKE_TOOLCHAIN_FILE=toolchain-linux.cmake -DROCM_PATH=/opt/rocm -DAMDGPU_TARGETS=all -DCMAKE_BUILD_TYPE=Release -DTensile_LOGIC=asm_full -DTensile_CODE_OBJECT_VERSION=V3 -DTensile_SEPARATE_ARCHITECTURES=ON -DTensile_LIBRARY_FORMAT=msgpack -DRUN_HEADER_TESTING=OFF -DBUILD_FILE_REORG_BACKWARD_COMPATIBILITY=ON -DCPACK_SET_DESTDIR=OFF -DCMAKE_INSTALL_PREFIX=rocblas-install -DCPACK_PACKAGING_INSTALL_PREFIX=/opt/rocm /home/jbaile/gpu2/apps/rocBLAS
    -- The CXX compiler identification is Clang 17.0.0
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: /opt/rocm/bin/hipcc - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Use hip-clang to build for amdgpu backend
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
    -- Found Threads: TRUE
    -- OS detected is ubuntu
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx803
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx803 - Success
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx900
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx900 - Success
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx906_xnack_off
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx906_xnack_off - Success
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx908_xnack_off
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx908_xnack_off - Success
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_on
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_on - Success
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_off
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_off - Success
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx1010
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx1010 - Success
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx1012
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx1012 - Success
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx1030
    -- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 - Success
    /home/jbaile/miniconda3/bin/python3 -m venv /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv --system-site-packages --clear
    /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/bin/python3 -m pip install git+https://github.com/ROCmSoftwarePlatform/Tensile.git@9ca08f38c4c3bfe6dfa02233637e7e3758c7b6db
    Collecting git+https://github.com/ROCmSoftwarePlatform/Tensile.git@9ca08f38c4c3bfe6dfa02233637e7e3758c7b6db
    Cloning https://github.com/ROCmSoftwarePlatform/Tensile.git (to revision 9ca08f38c4c3bfe6dfa02233637e7e3758c7b6db) to /tmp/pip-req-build-anopgidp
    Running command git clone --filter=blob:none --quiet https://github.com/ROCmSoftwarePlatform/Tensile.git /tmp/pip-req-build-anopgidp
    Running command git rev-parse -q --verify 'sha^9ca08f38c4c3bfe6dfa02233637e7e3758c7b6db'
    Running command git fetch -q https://github.com/ROCmSoftwarePlatform/Tensile.git 9ca08f38c4c3bfe6dfa02233637e7e3758c7b6db
    Running command git checkout -q 9ca08f38c4c3bfe6dfa02233637e7e3758c7b6db
    Resolved https://github.com/ROCmSoftwarePlatform/Tensile.git to commit 9ca08f38c4c3bfe6dfa02233637e7e3758c7b6db
    Running command git submodule update --init --recursive -q
    Preparing metadata (setup.py): started
    Preparing metadata (setup.py): finished with status 'done'
    Requirement already satisfied: pyyaml in /home/jbaile/amd-gpu-tools/ominprof/install/python-libs (from Tensile==4.33.0) (6.0)
    Requirement already satisfied: msgpack in /home/jbaile/miniconda3/lib/python3.9/site-packages (from Tensile==4.33.0) (1.0.4)
    Building wheels for collected packages: Tensile
    Building wheel for Tensile (setup.py): started
    Building wheel for Tensile (setup.py): finished with status 'done'
    Created wheel for Tensile: filename=Tensile-4.33.0-py3-none-any.whl size=4565859 sha256=7e0f1e3aecefd8668f97f75348bcd1744de61be5c15c433724042c5ec36acc97
    Stored in directory: /home/jbaile/.cache/pip/wheels/b5/9e/21/77227820d40984eba9330194f9795fb10b8068e1d6f9a1e774
    Successfully built Tensile
    Installing collected packages: Tensile
    Successfully installed Tensile-4.33.0
    WARNING: You are using pip version 22.0.4; however, version 23.0.1 is available.
    You should consider upgrading via the '/home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/bin/python3 -m pip install --upgrade pip' command.
    -- using GIT Tensile fork=ROCmSoftwarePlatform from branch=9ca08f38c4c3bfe6dfa02233637e7e3758c7b6db
    -- Adding /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv to CMAKE_PREFIX_PATH
    -- The C compiler identification is Clang 17.0.0
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: /opt/rocm/bin/hipcc - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- hip::amdhip64 is SHARED_LIBRARY
    -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
    -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
    -- hip::amdhip64 is SHARED_LIBRARY
    -- Using AMDGPU_TARGETS: gfx803;gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx1010;gfx1012;gfx1030
    -- Tensile script: /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/lib/python3.9/site-packages/Tensile/bin/TensileCreateLibrary
    -- Tensile_CREATE_COMMAND: /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/lib/python3.9/site-packages/Tensile/bin/TensileCreateLibrary;--merge-files;--separate-architectures;--no-short-file-names;--no-library-print-debug;--code-object-version=V3;--cxx-compiler=hipcc;--library-format=msgpack;--architecture=gfx803_gfx900_gfx906:xnack-_gfx908:xnack-_gfx90a:xnack+_gfx90a:xnack-_gfx1010_gfx1012_gfx1030;/home/jbaile/gpu2/apps/rocBLAS/library/src/blas3/Tensile/Logic/asm_full;/home/jbaile/gpu2/apps/rocBLAS/build/release/Tensile;HIP
    -- Tensile_MANIFEST_FILE_PATH: /home/jbaile/gpu2/apps/rocBLAS/build/release/Tensile/library/TensileManifest.txt
    '/home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/lib/python3.9/site-packages/Tensile/bin/TensileCreateLibrary' '--merge-files' '--separate-architectures' '--no-short-file-names' '--no-library-print-debug' '--code-object-version=V3' '--cxx-compiler=hipcc' '--library-format=msgpack' '--architecture=gfx803_gfx900_gfx906:xnack-_gfx908:xnack-_gfx90a:xnack+_gfx90a:xnack-_gfx1010_gfx1012_gfx1030' '/home/jbaile/gpu2/apps/rocBLAS/library/src/blas3/Tensile/Logic/asm_full' '/home/jbaile/gpu2/apps/rocBLAS/build/release/Tensile' 'HIP' '--generate-manifest-and-exit'
    Traceback (most recent call last):
    File "/home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/lib/python3.9/site-packages/Tensile/bin/TensileCreateLibrary", line 25, in <module>
    from Tensile.TensileCreateLibrary import TensileCreateLibrary
    ModuleNotFoundError: No module named 'Tensile.TensileCreateLibrary'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/lib/python3.9/site-packages/Tensile/bin/TensileCreateLibrary", line 32, in <module>
from Tensile.TensileCreateLibrary import TensileCreateLibrary
ModuleNotFoundError: No module named 'Tensile.TensileCreateLibrary'
CMake Error at build/release/virtualenv/cmake/TensileConfig.cmake:251 (message):
Error creating Tensile library: 1
Call Stack (most recent call first):
library/src/CMakeLists.txt:79 (TensileCreateLibraryFiles)

-- Configuring incomplete, errors occurred!
See also "/home/jbaile/gpu2/apps/rocBLAS/build/release/CMakeFiles/CMakeOutput.log".

  • check_exit_code 1
  • (( 1 != 0 ))
  • exit 1
    `

Environment

Hardware description
CPU Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
GPU AMD Radeon VII
Software version
rocm-core rocm-5.2.3
rocblas Building from source branch release/rocm-rel-5.2, commit hash 4a92c6f

Additional context

environment.txt

@rkamd rkamd self-assigned this Mar 6, 2023
@rkamd
Contributor

rkamd commented Mar 6, 2023

@jbaileyhandle,
I cannot reproduce this issue on my local machine or in an Ubuntu 20.04 ROCm Docker container.

It could be an issue specific to your environment. I also see a warning in the log that the pip version in use is older than the latest available.
Try the --upgrade_tensile_venv_pip build option to see whether the warning goes away and whether that makes any difference.

@jbaileyhandle
Author

jbaileyhandle commented Mar 7, 2023

install.sh does not recognize this option. I get:

./install.sh --upgrade_tensile_venv_pip
./install.sh: unrecognized option '--upgrade_tensile_venv_pip'

I'm guessing this is an option that exists for more current branches but not for release/rocm-rel-5.2?
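One quick way to check whether a particular checkout's script knows about an option is to search the script text for the option name. The sketch below is only a heuristic: it assumes the option name appears literally in install.sh when it is supported, and the helper name script_mentions_option is made up for illustration.

```shell
#!/usr/bin/env bash
# Heuristic: report whether a script's text mentions an option name at all.
# Usage: script_mentions_option <script-path> <option-name-without-dashes>
# (script_mentions_option is a hypothetical helper, not part of rocBLAS.)
script_mentions_option() {
    local script="$1" opt="$2"
    # -F: match a fixed string, -q: no output, --: end of grep's own options
    grep -Fq -- "$opt" "$script"
}

# Example, run from the root of the rocBLAS checkout:
#   script_mentions_option ./install.sh upgrade_tensile_venv_pip \
#       && echo "option present" || echo "option absent on this branch"
```

A match does not prove the option works as expected on that branch; it only shows the name occurs somewhere in the script.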

@rkamd
Contributor

rkamd commented Mar 7, 2023

Yes, it was added in a later release.

Please provide the output of the following command:
cat /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/pyvenv.cfg
Also list the packages visible to the virtual environment by running:
source /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/bin/activate
pip list >& venv_pip_list.txt
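The two requests above can be gathered into one file with a small helper. This is a minimal sketch, not an official rocBLAS tool: the function name collect_venv_info is hypothetical, it calls the virtualenv's own python3 with -m pip instead of activating the environment first, and it additionally records PYTHONPATH, which was not asked for but may be relevant because the log shows pyyaml being resolved from a directory outside the virtualenv.

```shell
#!/usr/bin/env bash
# Collect pyvenv.cfg and the package list of a build virtualenv into one file.
# Usage: collect_venv_info <virtualenv-dir> <output-file>
# (collect_venv_info is a hypothetical helper name used for illustration.)
collect_venv_info() {
    local venv="$1" out="$2"
    if [ ! -f "$venv/pyvenv.cfg" ]; then
        echo "no pyvenv.cfg under $venv" >&2
        return 1
    fi
    {
        echo "== pyvenv.cfg =="
        cat "$venv/pyvenv.cfg"
        echo "== PYTHONPATH =="
        echo "${PYTHONPATH:-<unset>}"
        echo "== pip list =="
        # Use the virtualenv's interpreter directly; no activation needed to list packages.
        "$venv/bin/python3" -m pip list 2>&1
    } > "$out"
}

# Example with the path from this thread:
#   collect_venv_info /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv venv_info.txt
```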

@rkamd
Contributor

rkamd commented Mar 7, 2023

@jbaileyhandle, you can also try upgrading pip before building rocBLAS.

@jbaileyhandle
Author

Upgrading PIP resolved the issue. Thank you so much for your help.

@jbaileyhandle
Author

jbaileyhandle commented Mar 30, 2023

A followup on this. I'm not sure upgrading pip was what got things working.
After playing around with recompiling rocBLAS, it seems the thing that made the difference was activating the virtual environment (source /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/bin/activate). If I don't do this, I get the error from the original post. If I do activate the virtual environment, the library compiles.

This seems a little unintuitive in that (1) I didn't see anything about this in the documentation for building from source (https://rocblas.readthedocs.io/en/latest/Linux_Install_Guide.html#building-and-installing-rocblas), though it's possible I missed something and (2) The virtual environment that needs to be activated is in the build/release directory, which is created by the install.sh script. But the install.sh script will fail to build the library unless I activate the virtual environment before running the script. So in effect, starting from scratch, I would need to run this script twice.
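The two-pass sequence described here can be written down as a sketch. It only mirrors the reported workaround and is not a documented rocBLAS procedure; the function name build_with_activated_venv is hypothetical, and it assumes the virtualenv lands in build/release/virtualenv as in the log. The log in the original post also shows install.sh removing build/release at startup, so why activation helps is not established in this thread.

```shell
#!/usr/bin/env bash
# Reported workaround: make sure the build virtualenv exists, activate it,
# then run install.sh again. Runs in a subshell so the caller's directory
# and environment are left untouched.
# Usage: build_with_activated_venv <rocBLAS-source-dir>
# (build_with_activated_venv is a hypothetical name used for illustration.)
build_with_activated_venv() (
    src="$1"
    venv="$src/build/release/virtualenv"
    cd "$src" || exit 1
    if [ ! -f "$venv/bin/activate" ]; then
        # First pass: expected to fail in this scenario, but it creates the virtualenv.
        ./install.sh || echo "first pass failed; continuing with activation" >&2
    fi
    if [ ! -f "$venv/bin/activate" ]; then
        echo "no virtualenv found at $venv" >&2
        exit 1
    fi
    . "$venv/bin/activate"
    # Second pass, now with the virtualenv active; its exit status is the result.
    ./install.sh
)

# Example:
#   build_with_activated_venv ~/gpu2/apps/rocBLAS
```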

@TorreZuk
Contributor

Thanks for your update @jbaileyhandle. I too was concerned about the state of the virtualenv: the cmake configure step likely activates it and runs a partial processing step for Tensile via cmake execute_process, but anything deferred to the build step may slip through the cracks and fall back to the original Python state. Some targets on the Tensile side might need to be revised to clearly flag when they are built (deferred from the configure step), and we might need to add explicit custom pre_build steps to reactivate the virtualenv, and possibly deactivate it in a custom cmake post_build step. I'm not sure whether this activation failure is caused by your Python and cmake versions (you also have a conda Python early in your path), which might explain why we don't see the failure.

The install.sh script is supposed to be self contained so we need to see what can be fixed/simplified. I see "The CXX compiler identification is Clang 17.0.0", which is a newer clang: you should clear other compilers out of your path or you'll mix the clang in /opt/rocm/llvm with your newer clang17. Also do you have a two different rocm installed in /opt/rocm or is it just a symlink to 5.2.3?
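Both questions can be checked from a shell. The following is a rough sketch with hypothetical helper names (resolve_rocm_path, list_compilers_on_path); it assumes bash and a readlink that supports -f, as on Ubuntu.

```shell
#!/usr/bin/env bash
# Print where a ROCm path really points, following symlinks.
# Usage: resolve_rocm_path [path]   (defaults to /opt/rocm)
resolve_rocm_path() {
    readlink -f "${1:-/opt/rocm}"
}

# List every clang and hipcc executable found on PATH, in lookup order.
list_compilers_on_path() {
    type -a -p clang hipcc 2>/dev/null
}

# Example:
#   resolve_rocm_path            # prints the versioned directory if /opt/rocm is a symlink
#   list_compilers_on_path       # more than one clang suggests a mixed toolchain
#   hipcc --version              # shows which clang hipcc actually wraps
```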

@jbaileyhandle
Author

"Also do you have a two different rocm installed in /opt/rocm or is it just a symlink to 5.2.3?"
It is just a symlink.

yoichiyoshida pushed a commit to yoichiyoshida/rocBLAS that referenced this issue May 10, 2023
Palamida compliance in CI files
@jinz2014

jinz2014 commented Jul 2, 2023

I could reproduce the issue reported. Upgrading pip does not solve the error.
"After playing around with recompiling rocBLAS, it seems the thing that made the difference was activating the virtual environment (source /home/jbaile/gpu2/apps/rocBLAS/build/release/virtualenv/bin/activate). If I don't do this, I get the error from the original post. If I do activate the virtual environment, the library compiles."
This is the solution. Thanks.

@cgmb
Contributor

cgmb commented Jul 4, 2023

Just FYI, on the develop branch it is now possible to use a -DBUILD_WITH_PIP=OFF CMake option to entirely avoid the use of pip/virtualenv during the rocBLAS build. When passed, the rocBLAS build script will not install Tensile with pip/virtualenv, but you may nevertheless find it helpful. It skips all the tinkering with the Python environment that the rocBLAS build would normally do and instead relies on you to ensure that Tensile and all its dependencies are available.

If you have cloned the Tensile repo to your home directory and checked out the commit corresponding to the tensile_tag.txt for your version of rocBLAS, you can build with:

./install.sh -id \
  --cmake-darg CMAKE_PREFIX_PATH=$HOME/Tensile/Tensile/cmake \
  --cmake-darg BUILD_WITH_PIP=OFF \
  --cmake-darg Tensile_ROOT=$HOME/Tensile/Tensile
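Preparing the Tensile checkout that the command above expects could look like the sketch below. It assumes tensile_tag.txt sits at the root of the rocBLAS checkout and contains only the commit hash; the helper names read_tensile_tag and clone_matching_tensile are hypothetical. The clone URL is the one that appears in the build log earlier in this thread.

```shell
#!/usr/bin/env bash
# Print the Tensile commit recorded in a rocBLAS checkout.
# Assumption: <rocBLAS>/tensile_tag.txt holds just the commit hash.
# Usage: read_tensile_tag <rocBLAS-source-dir>
read_tensile_tag() {
    local rocblas_src="$1"
    # Strip the trailing newline and any stray whitespace.
    tr -d '[:space:]' < "$rocblas_src/tensile_tag.txt"
}

# Clone Tensile into <dest> and check out the recorded commit (needs network access).
# Usage: clone_matching_tensile <rocBLAS-source-dir> <dest>
clone_matching_tensile() {
    local rocblas_src="$1" dest="$2" tag
    tag="$(read_tensile_tag "$rocblas_src")" || return 1
    [ -n "$tag" ] || return 1
    git clone https://github.com/ROCmSoftwarePlatform/Tensile.git "$dest" || return 1
    git -C "$dest" checkout "$tag"
}

# Example, run from the rocBLAS checkout before the install.sh command above:
#   clone_matching_tensile . "$HOME/Tensile"
```

With BUILD_WITH_PIP=OFF, Tensile's Python dependencies still have to be importable by whichever python3 the build picks up, as noted above.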
