CUDA code compile error with clang: function template partial specialization is not allowed #765

zhiweij1 · 2023-10-12T00:40:13Z

Branch/Tag/Commit

main

Docker Image Version

N/A

GPU name

N/A

CUDA Driver

N/A

Reproduced Steps

Smaller reproducer:


template<class T1, class T2>
struct A {
  template <class T3>
  void foo(T3 x);
};

template<typename T1, typename T2>
template<typename T3>
void A<T1, T2>::foo<T3>(T3 x) {

}

https://godbolt.org/z/xn7vre9db

The original code comes from

FasterTransformer/src/fastertransformer/kernels/cutlass_kernels/fpA_intB_gemm/fpA_intB_gemm_template.h

Line 426 in afdf9a9:

void CutlassFpAIntBGemmRunner<T, WeightType>::dispatch_to_arch<EpilogueTag>(const T*          A,

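The out-of-class definition of “foo” is ill-formed according to https://en.cppreference.com/w/cpp/language/member_template. Writing “foo<T3>” in the declarator makes clang treat the definition as a partial specialization of a function template, and function templates cannot be partially specialized.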

The LLVM community does not consider this a bug in clang: llvm/llvm-project#68772

I think FasterTransformer should fix the usage in the source code.
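A minimal sketch of the fix on the reduced reproducer, assuming the intent is an ordinary out-of-class definition of the member template and not a specialization. The only change is dropping the “<T3>” after the function name; the names A, foo, T1, T2 and T3 are those of the reproducer above.

```cpp
// Class template with a member function template, as in the reproducer.
template <class T1, class T2>
struct A {
    template <class T3>
    void foo(T3 x);
};

// Out-of-class definition: first the class's template parameter list,
// then the member's. The declarator names plain `foo`, with no
// template-argument list after it.
template <typename T1, typename T2>
template <typename T3>
void A<T1, T2>::foo(T3 x) {
    (void)x;  // body intentionally empty, as in the reproducer
}
```

With this form, calls such as a.foo(42) or a.foo<double>(1.5) on an A<int, float> object should compile. The same change (removing “<EpilogueTag>” after dispatch_to_arch) would presumably apply to the FasterTransformer definition quoted above, although the rest of that definition is not shown here.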

zhiweij1 added the bug label Oct 12, 2023

zhiweij1 mentioned this issue Oct 12, 2023

CUDA code compile error: function template partial specialization is not allowed llvm/llvm-project#68772

Closed
