compilation error in gpu example gpu_device_timer #244

Open
pkestene opened this issue Feb 6, 2022 · 4 comments

Comments

pkestene commented Feb 6, 2022

Hello,

I'm new to timemory.
I was just trying to build with CUDA/GPU support, and I get a compilation error when building the GPU examples.
It looks a bit weird to me. The compiler doesn't seem to be able to find the right overload of data_tracker::store; I don't see anything wrong in the code.

Here is the full compilation command and the error:

[ 93%] Building CUDA object examples/ex-gpu/v3/CMakeFiles/ex_kernel_instrument_v3.dir/gpu_device_timer.cpp.o
cd /home/pkestene/install/timemory/git/timemory/build/cuda/examples/ex-gpu/v3 && /usr/local/cuda-11.6/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/c++ -DTIMEMORY_CMAKE -DTIMEMORY_USE_BACKENDS_EXTERN -DTIMEMORY_USE_COMMON_EXTERN -DTIMEMORY_USE_COMPONENT_EXTERN -DTIMEMORY_USE_CONFIG_EXTERN -DTIMEMORY_USE_CONTAINERS_EXTERN -DTIMEMORY_USE_CORE_EXTERN -DTIMEMORY_USE_CUDA -DTIMEMORY_USE_CUDA_EXTERN -DTIMEMORY_USE_DATA_TRACKER_EXTERN -DTIMEMORY_USE_ERT_EXTERN -DTIMEMORY_USE_EXTERN -DTIMEMORY_USE_GPU -DTIMEMORY_USE_IO_EXTERN -DTIMEMORY_USE_LIBUNWIND -DTIMEMORY_USE_MANAGER_EXTERN -DTIMEMORY_USE_NETWORK_EXTERN -DTIMEMORY_USE_NVTX -DTIMEMORY_USE_OPERATIONS_EXTERN -DTIMEMORY_USE_PRINTER_EXTERN -DTIMEMORY_USE_RUNTIME_EXTERN -DTIMEMORY_USE_RUSAGE_EXTERN -DTIMEMORY_USE_STATISTICS -DTIMEMORY_USE_STORAGE_EXTERN -DTIMEMORY_USE_TIMESTAMP_EXTERN -DTIMEMORY_USE_TIMING_EXTERN -DTIMEMORY_USE_TRIP_COUNT_EXTERN -DTIMEMORY_USE_USER_BUNDLE_EXTERN -DTIMEMORY_USE_VARIADIC_EXTERN -DTIMEMORY_USE_XML -DTIMEMORY_VEC=256 -DUNW_LOCAL_ONLY -Dex_kernel_instrument_v3_EXPORTS -I/home/pkestene/install/timemory/git/timemory/build/cuda/source -I/home/pkestene/install/timemory/git/timemory/source -I/usr/local/cuda-11.6/include -isystem=/usr/local/cuda-11.6/targets/x86_64-linux/include -arch=sm_75 -O3 -DNDEBUG --generate-code=arch=compute_75,code=[compute_75,sm_75] -arch=sm_75 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75 --extended-lambda -Xcompiler=-W -Xcompiler=-Wall -Xcompiler=-Wno-unknown-pragmas -Xcompiler=-Wno-ignored-attributes -Xcompiler=-Wno-attributes -Xcompiler=-Wno-missing-field-initializers -Xcompiler=-Wno-class-memaccess -Xcompiler=-fno-signaling-nans -Xcompiler=-fno-trapping-math -Xcompiler=-fno-signed-zeros -Xcompiler=-ffinite-math-only -Xcompiler=-fno-math-errno -Xcompiler=-fpredictive-commoning -Xcompiler=-fvariable-expansion-in-unroller -Xcompiler=-faligned-new -Xcompiler=-ftls-model=initial-exec -Xcompiler=-rdynamic 
-Xcompiler=-finline-functions -Xcompiler=-funroll-loops -Xcompiler=-ftree-vectorize -Xcompiler=-ftree-loop-optimize -Xcompiler=-ftree-loop-vectorize -lineinfo -std=c++14 -x cu -c /home/pkestene/install/timemory/git/timemory/examples/ex-gpu/v3/gpu_device_timer.cpp -o CMakeFiles/ex_kernel_instrument_v3.dir/gpu_device_timer.cpp.o
/home/pkestene/install/timemory/git/timemory/examples/ex-gpu/v3/gpu_device_timer.hpp(134): warning #177-D: variable "_data" was declared but never referenced

/home/pkestene/install/timemory/git/timemory/source/timemory/components/data_tracker/components.hpp(677): error: no instance of overloaded function "tim::component::data_tracker<InpT, Tag>::store [with InpT=double, Tag=gpu_data_tag]" matches the argument list
            argument types are: (std::plus<double>, double)
            object type is: tim::component::data_tracker<double, gpu_data_tag>
          detected during instantiation of "tim::component::data_tracker<InpT, Tag>::this_type *tim::component::data_tracker<InpT, Tag>::add_secondary(const std::string &, FuncT &&, T &&, tim::component::data_tracker<InpT, Tag>::enable_if_acceptable_t<T, int>) [with InpT=double, Tag=gpu_data_tag, FuncT=std::plus<double>, T=double &]" 
/home/pkestene/install/timemory/git/timemory/examples/ex-gpu/v3/gpu_device_timer.cpp(90): here

The host compiler is g++-11, but I also tried g++-10 and the error is still there.

Any help appreciated.

Collaborator

jrmadsen commented Feb 8, 2022

Interesting... that overload is used quite often. Could you try replacing std::plus<double>{} with a lambda, e.g. [](double lhs, double rhs) { return lhs + rhs; }?
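
For readers following along, here is a minimal sketch of the swap being suggested. It assumes nothing about timemory's internals: `toy_tracker` and its `store` member are hypothetical stand-ins, not the real `data_tracker` API. It only illustrates that `std::plus<double>{}` and the lambda are interchangeable as binary callables for a templated `store`-style function.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for a component that combines its stored value with
// an incoming value through a caller-supplied binary operation.
// This is NOT timemory's data_tracker; it only mirrors the call shape
// store(callable, value) that appears in the error message.
struct toy_tracker
{
    double value = 0.0;

    template <typename FuncT>
    void store(FuncT&& f, double rhs)
    {
        // Apply the binary operation to (current value, new value).
        value = std::forward<FuncT>(f)(value, rhs);
    }
};

// Variant 1: the standard function object used by the example.
inline double sum_with_plus(double a, double b)
{
    toy_tracker t;
    t.store(std::plus<double>{}, a);
    t.store(std::plus<double>{}, b);
    return t.value;
}

// Variant 2: the lambda proposed as a replacement.
inline double sum_with_lambda(double a, double b)
{
    toy_tracker t;
    auto add = [](double lhs, double rhs) { return lhs + rhs; };
    t.store(add, a);
    t.store(add, b);
    return t.value;
}
```

With a conforming host compiler both variants resolve to the same template and produce the same result, so the swap is mainly a way to check whether NVCC's overload resolution treats the two callables differently.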

Collaborator

jrmadsen commented Feb 8, 2022

Ah, based on this line, [ 93%] Building CUDA object examples/ex-gpu/v3/CMakeFiles/ex_kernel_instrument_v3.dir/gpu_device_timer.cpp.o, I think this might be an NVCC bug. Unfortunately, NVCC is quite unreliable when it comes to templates. If the above fails, could you try another CUDA version, rather than a different GCC version, to verify whether it is a CUDA 11.6 bug?

Author

pkestene commented Feb 8, 2022

Thanks for your answer. Unfortunately:

  • same error with cuda toolkit 11.5.2
  • if I change std::plus<double>{} into [](double lhs, double rhs) { return lhs + rhs; }, the error is similar
/data/pkestene/install/timemory/git/timemory/source/timemory/components/data_tracker/components.hpp(677): error: no instance of overloaded function "tim::component::data_tracker<InpT, Tag>::store [with InpT=double, Tag=gpu_data_tag]" matches the argument list
            argument types are: (lambda [](double, double)->double, double)
            object type is: tim::component::data_tracker<double, gpu_data_tag>
          detected during instantiation of "tim::component::data_tracker<InpT, Tag>::this_type *tim::component::data_tracker<InpT, Tag>::add_secondary(const std::string &, FuncT &&, T &&, tim::component::data_tracker<InpT, Tag>::enable_if_acceptable_t<T, int>) [with InpT=double, Tag=gpu_data_tag, FuncT=lambda [](double, double)->double, T=double &]" 
/data/pkestene/install/timemory/git/timemory/examples/ex-gpu/v3/gpu_device_timer.cpp(92): here

Collaborator

jrmadsen commented Feb 8, 2022

Yeah, I was able to reproduce it. It is definitely an NVCC bug: if I make the necessary changes to compile gpu_device_timer.cpp and gpu_op_tracker.cpp with the host compiler (basically guarding the kernel launches and device functions with #if defined(TIMEMORY_GPUCC) and tweaking the CMakeLists.txt to only set ex_kernel_instrument.cpp as a CUDA source), then it compiles and runs fine. Let me think a bit more on how this should be handled and get back to you, because I am getting tired of having to create workarounds for templates with NVCC, e.g. #237.
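
The guarding approach described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual change made to the examples: `EXAMPLE_GPUCC`, `scale_kernel`, and `scale` are hypothetical names, and `EXAMPLE_GPUCC` is derived here from `__CUDACC__` (which NVCC defines) as a stand-in for `TIMEMORY_GPUCC`, whose definition is not shown in this thread.

```cpp
#include <cstddef>
#include <vector>

// Assumption: EXAMPLE_GPUCC is a hypothetical stand-in for TIMEMORY_GPUCC.
// It is only defined when the file is compiled by NVCC.
#if defined(__CUDACC__)
#    define EXAMPLE_GPUCC 1
#    include <cuda_runtime.h>
#endif

#if defined(EXAMPLE_GPUCC)
// Device code is only visible to the GPU compiler.
__global__ void scale_kernel(double* data, std::size_t n, double factor)
{
    std::size_t i = static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if(i < n)
        data[i] *= factor;
}
#endif

// Same entry point for both compilers: the kernel launch is guarded, and the
// host compiler sees a plain loop instead.
inline void scale(std::vector<double>& values, double factor)
{
#if defined(EXAMPLE_GPUCC)
    if(values.empty())
        return;
    // Error checking of the CUDA runtime calls is omitted for brevity.
    double*           device = nullptr;
    const std::size_t bytes  = values.size() * sizeof(double);
    cudaMalloc(&device, bytes);
    cudaMemcpy(device, values.data(), bytes, cudaMemcpyHostToDevice);
    const unsigned int threads = 256;
    const unsigned int blocks =
        static_cast<unsigned int>((values.size() + threads - 1) / threads);
    scale_kernel<<<blocks, threads>>>(device, values.size(), factor);
    cudaMemcpy(values.data(), device, bytes, cudaMemcpyDeviceToHost);
    cudaFree(device);
#else
    for(auto& v : values)
        v *= factor;
#endif
}
```

Compiled with the host compiler, the file contains no CUDA syntax at all, so template-heavy code in the same translation unit never passes through NVCC's front end. Only sources that genuinely need kernels would then be marked as CUDA sources in the build.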
