include/ctranslate2/ops/flash-attention/flash_fwd_launch_template.h(15): error: identifier "__grid_constant__" is undefined #1664

Open
twmht opened this issue Apr 12, 2024 · 3 comments


twmht commented Apr 12, 2024

I am using the Orin-NX with CUDA version 11.4. The following error occurs during compilation:

(jarvis) aaeon@BOXER-8651AI:~/CTranslate2/build$ cmake .. -DWITH_MKL=OFF -DWITH_CUDA=ON -DWITH_CUDNN=ON -DOPENMP_RUNTIME=NONE
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Build spdlog: 1.10.0
-- Build type: Release
-- Compiling for multiple CPU ISA and enabling runtime dispatch
-- Found CUDA: /usr/local/cuda-11.4 (found suitable version "11.4", minimum required is "11.0")
-- Autodetected CUDA architecture(s):  8.7
-- NVCC host compiler: /usr/bin/c++
-- NVCC compilation flags: -std=c++17;-gencode;arch=compute_87,code=sm_87;--expt-relaxed-constexpr;--expt-extended-lambda;--use_fast_math
-- Found cuDNN include directory: /usr/include
-- Found cuDNN libraries: /usr/lib/aarch64-linux-gnu/libcudnn.so
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR - Success
-- Configuring done
-- Generating done
-- Build files have been written to: /home/aaeon/CTranslate2/build
(jarvis) aaeon@BOXER-8651AI:~/CTranslate2/build$ make -j4
[  0%] Building NVCC (Device) object CMakeFiles/ctranslate2.dir/src/cuda/ctranslate2_generated_primitives.cu.o
[  0%] Building NVCC (Device) object CMakeFiles/ctranslate2.dir/src/ops/ctranslate2_generated_alibi_add_gpu.cu.o
[  2%] Building NVCC (Device) object CMakeFiles/ctranslate2.dir/src/cuda/ctranslate2_generated_random.cu.o
[  2%] Building NVCC (Device) object CMakeFiles/ctranslate2.dir/src/ops/flash-attention/ctranslate2_generated_flash_fwd_split_hdim256_fp16_sm80.cu.o
/home/aaeon/CTranslate2/include/ctranslate2/ops/flash-attention/flash_fwd_launch_template.h(15): warning: attribute "__global__" does not apply here

/home/aaeon/CTranslate2/include/ctranslate2/ops/flash-attention/flash_fwd_launch_template.h(15): error: incomplete type is not allowed

/home/aaeon/CTranslate2/include/ctranslate2/ops/flash-attention/flash_fwd_launch_template.h(15): error: identifier "__grid_constant__" is undefined

/home/aaeon/CTranslate2/include/ctranslate2/ops/flash-attention/flash_fwd_launch_template.h(15): error: expected a ")"

/home/aaeon/CTranslate2/include/ctranslate2/ops/flash-attention/flash_fwd_launch_template.h(15): error: expected a ";"

4 errors detected in the compilation of "/home/aaeon/CTranslate2/src/ops/flash-attention/flash_fwd_split_hdim256_fp16_sm80.cu".
CMake Error at ctranslate2_generated_flash_fwd_split_hdim256_fp16_sm80.cu.o.Release.cmake:280 (message):
  Error generating file
  /home/aaeon/CTranslate2/build/CMakeFiles/ctranslate2.dir/src/ops/flash-attention/./ctranslate2_generated_flash_fwd_split_hdim256_fp16_sm80.cu.o


make[2]: *** [CMakeFiles/ctranslate2.dir/build.make:429: CMakeFiles/ctranslate2.dir/src/ops/flash-attention/ctranslate2_generated_flash_fwd_split_hdim256_fp16_sm80.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
i^Cmake[2]: *** [CMakeFiles/ctranslate2.dir/build.make:65: CMakeFiles/ctranslate2.dir/src/cuda/ctranslate2_generated_primitives.cu.o] Interrupt
make[1]: *** [CMakeFiles/Makefile2:114: CMakeFiles/ctranslate2.dir/all] Interrupt
make: *** [Makefile:130: all] Interrupt

Any idea?

Author

twmht commented Apr 12, 2024

Does __grid_constant__ only work with CUDA 11.7 and newer?

Collaborator

minhthuc2502 commented Apr 12, 2024

According to the CUDA documentation, __grid_constant__ is supported from compute capability 7.0 (sm_70) onward, but I am not sure which CUDA version introduced it. As a workaround, you can try removing __grid_constant__ from the kernel signature. It should work even without this macro.
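
A minimal sketch of that workaround, for illustration only: rather than deleting the annotation outright, it can be hidden behind a version guard so that older toolkits see an empty macro. Everything named here (`EXAMPLE_GRID_CONSTANT`, `Params`, `scale_kernel`, `scale_on_gpu`) is hypothetical and not part of CTranslate2 or flash-attention. The 11.7 threshold is an assumption taken from the question above, not something confirmed in this thread, and the sketch assumes the target architecture is sm_70 or newer.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Assumption: __grid_constant__ is available from nvcc 11.7 onward.
// On older toolkits the macro expands to nothing, so the kernel still
// compiles and the parameter is passed as an ordinary by-value argument.
#if defined(__CUDACC_VER_MAJOR__) && \
    (__CUDACC_VER_MAJOR__ > 11 || \
     (__CUDACC_VER_MAJOR__ == 11 && __CUDACC_VER_MINOR__ >= 7))
#define EXAMPLE_GRID_CONSTANT __grid_constant__
#else
#define EXAMPLE_GRID_CONSTANT
#endif

// Hypothetical parameter struct standing in for a larger kernel-params type.
struct Params {
  int n;
  float scale;
};

// __grid_constant__ requires a const-qualified, non-reference parameter.
__global__ void scale_kernel(EXAMPLE_GRID_CONSTANT const Params params,
                             float* data) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < params.n) {
    data[i] *= params.scale;
  }
}

// Host helper: copies the input to the device, scales it, and copies it back.
// Returns an empty vector if the device allocation fails.
std::vector<float> scale_on_gpu(std::vector<float> host, float scale) {
  if (host.empty()) {
    return host;
  }
  const Params params{static_cast<int>(host.size()), scale};
  const size_t bytes = host.size() * sizeof(float);

  float* dev = nullptr;
  if (cudaMalloc(&dev, bytes) != cudaSuccess) {
    return {};
  }
  cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);

  const int threads = 128;
  const int blocks = (params.n + threads - 1) / threads;
  scale_kernel<<<blocks, threads>>>(params, dev);

  // cudaMemcpy on the default stream waits for the kernel to finish.
  cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
  cudaFree(dev);
  return host;
}
```

The annotation only tells the compiler that the parameter is read-only for the whole grid, so dropping it should not change the results of a kernel that never writes to the parameter. That is why the kernel body is identical on both sides of the guard.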


BBC-Esq commented Apr 25, 2024

Was this resolved? As I sit drinking my morning coffee reading about one of my favorite libraries, ctranslate2, I don't want to waste time reading about issues that have been resolved. @twmht how'd it go?
