Inquiry about Minimum GPU Requirements #166

Open
loiqy opened this issue Mar 25, 2024 · 1 comment
loiqy commented Mar 25, 2024

Hi there,

I encountered an error while installing the project and received the following message:

ptxas /tmp/tmpxft_0010a9a0_00000000-6_gemm_cuda_gen.ptx, line 709; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010a9a0_00000000-6_gemm_cuda_gen.ptx, line 713; error   : Feature '.m16n8k16' requires .target sm_80 or higher
...

Can I run AWQ on an RTX 2080 Ti?
Could you please provide the minimum GPU requirements for this project? Specifically, what compute capability does the GPU need?
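
For reference, this is how I checked the compute capability on my machines (a quick PyTorch sketch; the (8, 0) threshold is only my guess based on the ptxas message above, not something I found in the project docs):

import torch

REQUIRED = (8, 0)  # assumed minimum, inferred from the sm_80 ptxas error
for i in range(torch.cuda.device_count()):
    name = torch.cuda.get_device_name(i)
    cap = torch.cuda.get_device_capability(i)  # e.g. (7, 5) for an RTX 2080 Ti
    status = "meets sm_80" if cap >= REQUIRED else "below sm_80"
    print(f"GPU {i}: {name}, compute capability {cap[0]}.{cap[1]} ({status})")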

My Env

OS: Ubuntu 20.04.6 LTS (GNU/Linux 5.4.0-172-generic x86_64)
GPU: 2 x NVIDIA GeForce RTX 2080 Ti
Driver Version: 550.54.14
CUDA: 12.1
Python: 3.10.14

nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

Output

(awq) llq@accepted:~/WorkSpace/Quantization/llm-awq/awq/kernels$ python setup.py install
running install
/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
        ********************************************************************************

!!
  self.initialize_options()
/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/setuptools/_distutils/cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` and ``easy_install``.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://github.com/pypa/setuptools/issues/917 for details.
        ********************************************************************************

!!
  self.initialize_options()
running bdist_egg
running egg_info
writing awq_inference_engine.egg-info/PKG-INFO
writing dependency_links to awq_inference_engine.egg-info/dependency_links.txt
writing requirements to awq_inference_engine.egg-info/requires.txt
writing top-level names to awq_inference_engine.egg-info/top_level.txt
/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/utils/cpp_extension.py:500: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'awq_inference_engine.egg-info/SOURCES.txt'
writing manifest file 'awq_inference_engine.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/utils/cpp_extension.py:425: UserWarning: There are no g++ version bounds defined for CUDA version 12.1
  warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'awq_inference_engine' extension
/home/llq/cuda-12.1/bin/nvcc -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/TH -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/THC -I/home/llq/cuda-12.1/include -I/home/llq/miniconda3/envs/awq/include/python3.10 -c csrc/attention/decoder_masked_multihead_attention.cu -o build/temp.linux-x86_64-cpython-310/csrc/attention/decoder_masked_multihead_attention.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -std=c++17 -DENABLE_BF16 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --threads=8 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=awq_inference_engine -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75
csrc/attention/decoder_masked_multihead_attention_template.hpp(989): warning #177-D: variable "v_offset" was declared but never referenced
      int v_offset = k_offset;
          ^
          detected during:
            instantiation of "void mmha_launch_kernel<T,Dh,Dh_MAX,KERNEL_PARAMS_TYPE>(const KERNEL_PARAMS_TYPE &, const cudaStream_t &) [with T=float, Dh=32, Dh_MAX=32, KERNEL_PARAMS_TYPE=Multihead_attention_params<float, false>]" at line 70 of csrc/attention/decoder_masked_multihead_attention.cu
            instantiation of "void multihead_attention_<T,KERNEL_PARAMS_TYPE>(const KERNEL_PARAMS_TYPE &, const cudaStream_t &) [with T=float, KERNEL_PARAMS_TYPE=Multihead_attention_params<float, false>]" at line 111 of csrc/attention/decoder_masked_multihead_attention.cu

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

csrc/attention/decoder_masked_multihead_attention_template.hpp(995): warning #177-D: variable "v_bias_offset" was declared but never referenced
      int v_bias_offset = k_bias_offset;
          ^
          detected during:
            instantiation of "void mmha_launch_kernel<T,Dh,Dh_MAX,KERNEL_PARAMS_TYPE>(const KERNEL_PARAMS_TYPE &, const cudaStream_t &) [with T=float, Dh=32, Dh_MAX=32, KERNEL_PARAMS_TYPE=Multihead_attention_params<float, false>]" at line 70 of csrc/attention/decoder_masked_multihead_attention.cu
            instantiation of "void multihead_attention_<T,KERNEL_PARAMS_TYPE>(const KERNEL_PARAMS_TYPE &, const cudaStream_t &) [with T=float, KERNEL_PARAMS_TYPE=Multihead_attention_params<float, false>]" at line 111 of csrc/attention/decoder_masked_multihead_attention.cu

gcc -pthread -B /home/llq/miniconda3/envs/awq/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/llq/miniconda3/envs/awq/include -fPIC -O2 -isystem /home/llq/miniconda3/envs/awq/include -fPIC -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/TH -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/THC -I/home/llq/cuda-12.1/include -I/home/llq/miniconda3/envs/awq/include/python3.10 -c csrc/attention/ft_attention.cpp -o build/temp.linux-x86_64-cpython-310/csrc/attention/ft_attention.o -g -O3 -fopenmp -lgomp -std=c++17 -DENABLE_BF16 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=awq_inference_engine -D_GLIBCXX_USE_CXX11_ABI=0
csrc/attention/ft_attention.cpp: In instantiation of ‘void set_params(Masked_multihead_attention_params<T>&, size_t, size_t, size_t, size_t, size_t, int, int, float, bool, int, T*, T*, T*, T*, T*, int*, float*, T*) [with T = short unsigned int; Masked_multihead_attention_params<T> = Multihead_attention_params<short unsigned int, false>; size_t = long unsigned int]’:
csrc/attention/ft_attention.cpp:163:5:   required from here
csrc/attention/ft_attention.cpp:72:11: warning: ‘void* memset(void*, int, size_t)’ clearing an object of non-trivial type ‘Masked_multihead_attention_params<short unsigned int>’ {aka ‘struct Multihead_attention_params<short unsigned int, false>’}; use assignment or value-initialization instead [-Wclass-memaccess]
   72 |     memset(&params, 0, sizeof(params));
      |     ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from csrc/attention/ft_attention.cpp:8:
csrc/attention/decoder_masked_multihead_attention.h:121:8: note: ‘Masked_multihead_attention_params<short unsigned int>’ {aka ‘struct Multihead_attention_params<short unsigned int, false>’} declared here
  121 | struct Multihead_attention_params: public Multihead_attention_params_base<T> {
      |        ^~~~~~~~~~~~~~~~~~~~~~~~~~
csrc/attention/ft_attention.cpp: In instantiation of ‘void set_params(Masked_multihead_attention_params<T>&, size_t, size_t, size_t, size_t, size_t, int, int, float, bool, int, T*, T*, T*, T*, T*, int*, float*, T*) [with T = __nv_bfloat16; Masked_multihead_attention_params<T> = Multihead_attention_params<__nv_bfloat16, false>; size_t = long unsigned int]’:
csrc/attention/ft_attention.cpp:163:5:   required from here
csrc/attention/ft_attention.cpp:72:11: warning: ‘void* memset(void*, int, size_t)’ clearing an object of non-trivial type ‘Masked_multihead_attention_params<__nv_bfloat16>’ {aka ‘struct Multihead_attention_params<__nv_bfloat16, false>’}; use assignment or value-initialization instead [-Wclass-memaccess]
   72 |     memset(&params, 0, sizeof(params));
      |     ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from csrc/attention/ft_attention.cpp:8:
csrc/attention/decoder_masked_multihead_attention.h:121:8: note: ‘Masked_multihead_attention_params<__nv_bfloat16>’ {aka ‘struct Multihead_attention_params<__nv_bfloat16, false>’} declared here
  121 | struct Multihead_attention_params: public Multihead_attention_params_base<T> {
      |        ^~~~~~~~~~~~~~~~~~~~~~~~~~
csrc/attention/ft_attention.cpp: In instantiation of ‘void set_params(Masked_multihead_attention_params<T>&, size_t, size_t, size_t, size_t, size_t, int, int, float, bool, int, T*, T*, T*, T*, T*, int*, float*, T*) [with T = float; Masked_multihead_attention_params<T> = Multihead_attention_params<float, false>; size_t = long unsigned int]’:
csrc/attention/ft_attention.cpp:163:5:   required from here
csrc/attention/ft_attention.cpp:72:11: warning: ‘void* memset(void*, int, size_t)’ clearing an object of non-trivial type ‘Masked_multihead_attention_params<float>’ {aka ‘struct Multihead_attention_params<float, false>’}; use assignment or value-initialization instead [-Wclass-memaccess]
   72 |     memset(&params, 0, sizeof(params));
      |     ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from csrc/attention/ft_attention.cpp:8:
csrc/attention/decoder_masked_multihead_attention.h:121:8: note: ‘Masked_multihead_attention_params<float>’ {aka ‘struct Multihead_attention_params<float, false>’} declared here
  121 | struct Multihead_attention_params: public Multihead_attention_params_base<T> {
      |        ^~~~~~~~~~~~~~~~~~~~~~~~~~
/home/llq/cuda-12.1/bin/nvcc -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/TH -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/THC -I/home/llq/cuda-12.1/include -I/home/llq/miniconda3/envs/awq/include/python3.10 -c csrc/layernorm/layernorm.cu -o build/temp.linux-x86_64-cpython-310/csrc/layernorm/layernorm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -std=c++17 -DENABLE_BF16 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --threads=8 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=awq_inference_engine -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75
/home/llq/cuda-12.1/bin/nvcc -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/TH -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/THC -I/home/llq/cuda-12.1/include -I/home/llq/miniconda3/envs/awq/include/python3.10 -c csrc/position_embedding/pos_encoding_kernels.cu -o build/temp.linux-x86_64-cpython-310/csrc/position_embedding/pos_encoding_kernels.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -std=c++17 -DENABLE_BF16 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --threads=8 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=awq_inference_engine -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75
gcc -pthread -B /home/llq/miniconda3/envs/awq/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/llq/miniconda3/envs/awq/include -fPIC -O2 -isystem /home/llq/miniconda3/envs/awq/include -fPIC -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/TH -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/THC -I/home/llq/cuda-12.1/include -I/home/llq/miniconda3/envs/awq/include/python3.10 -c csrc/pybind.cpp -o build/temp.linux-x86_64-cpython-310/csrc/pybind.o -g -O3 -fopenmp -lgomp -std=c++17 -DENABLE_BF16 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=awq_inference_engine -D_GLIBCXX_USE_CXX11_ABI=0
/home/llq/cuda-12.1/bin/nvcc -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/TH -I/home/llq/miniconda3/envs/awq/lib/python3.10/site-packages/torch/include/THC -I/home/llq/cuda-12.1/include -I/home/llq/miniconda3/envs/awq/include/python3.10 -c csrc/quantization/gemm_cuda_gen.cu -o build/temp.linux-x86_64-cpython-310/csrc/quantization/gemm_cuda_gen.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -std=c++17 -DENABLE_BF16 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --threads=8 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=awq_inference_engine -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75
csrc/quantization/gemm_cuda_gen.cu(34): warning #177-D: variable "ZERO" was declared but never referenced
    static constexpr uint32_t ZERO = 0x0;
                              ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

csrc/quantization/gemm_cuda_gen.cu(44): warning #177-D: variable "blockIdx_x" was declared but never referenced
    int blockIdx_x = 0;
        ^

csrc/quantization/gemm_cuda_gen.cu(65): warning #177-D: variable "ld_zero_flag" was declared but never referenced
    bool ld_zero_flag = (threadIdx.y * 32 + threadIdx.x) * 8 < 64;
         ^

csrc/quantization/gemm_cuda_gen.cu(21): warning #177-D: function "__pack_half2" was declared but never referenced
  __pack_half2(const half x, const half y) {
  ^

ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 709; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 713; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 717; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 721; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 725; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 729; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 733; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 737; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 741; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 745; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 749; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 753; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 757; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 761; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 765; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 769; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 821; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 825; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 829; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 833; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 837; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 841; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 845; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 849; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 853; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 857; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 861; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 865; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 869; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 873; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 877; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 881; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2185; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2189; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2193; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2197; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2201; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2205; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2209; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2213; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2217; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2221; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2225; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2229; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2233; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2237; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2241; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2245; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2297; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2301; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2305; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2309; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2313; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2317; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2321; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2325; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2329; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2333; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2337; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2341; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2345; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2349; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2353; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0010af6b_00000000-6_gemm_cuda_gen.ptx, line 2357; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas fatal   : Ptx assembly aborted due to errors
error: command '/home/llq/cuda-12.1/bin/nvcc' failed with exit code 255

Any guidance on how to address this issue would be greatly appreciated.

@1SingleFeng

Hello, from my understanding, this project requires a GPU compute capability of 8.0 or higher, but the RTX 2080 Ti is only 7.5 (see https://developer.nvidia.com/cuda-gpus). You can use AutoAWQ (https://github.com/casper-hansen/AutoAWQ) instead, which supports GPUs with a compute capability of 7.5.
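
For example, a minimal quantization sketch along the lines of AutoAWQ's README quick start (the model path, output path, and quant_config values below are placeholders; check the AutoAWQ docs for the exact options supported by your version):

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "facebook/opt-125m"   # placeholder model, substitute your own
quant_path = "opt-125m-awq"        # where the quantized weights are written
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the FP16 model and its tokenizer
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize with AWQ and save the result
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)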
