CUDA code compile error with clang: function template partial specialization is not allowed #765

Open
zhiweij1 opened this issue Oct 12, 2023 · 0 comments
Labels
bug Something isn't working

zhiweij1 commented Oct 12, 2023

Branch/Tag/Commit

main

Docker Image Version

N/A

GPU name

N/A

CUDA Driver

N/A

Reproduced Steps

Smaller reproducer:


template<class T1, class T2>
struct A {
  template <class T3>
  void foo(T3 x);
};

template<typename T1, typename T2>
template<typename T3>
void A<T1, T2>::foo<T3>(T3 x) {

}

https://godbolt.org/z/xn7vre9db

The original code comes from

void CutlassFpAIntBGemmRunner<T, WeightType>::dispatch_to_arch<EpilogueTag>(const T* A,

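The syntax of the out-of-class definition of “foo” is incorrect according to https://en.cppreference.com/w/cpp/language/member_template: the template argument list “<T3>” must not be repeated after the member function name, so clang treats the definition as a function template partial specialization and rejects it.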

The LLVM community does not consider this a bug in clang: llvm/llvm-project#68772

I think FasterTransformer should fix the usage in the source code.
