awq_inference_engine has no attribute 'gemm_forward_cuda_new' #153

Open
pribadihcr opened this issue Mar 6, 2024 · 4 comments

@pribadihcr

Hi, when running the VLM demo:

python vlm_demo.py --model-path ../cache/VILA-7b --quant-path ../cache/VILA-7b-4bit-awq/vila-7b-w4-g128-v2.pt --precision W4A16 --image-file sample.jpg

I got the following error:

AttributeError: module 'awq_inference_engine' has no attribute 'gemm_forward_cuda_new'. Did you mean: 'gemm_forward_cuda'?
@ys-2020
Contributor

ys-2020 commented Mar 7, 2024

Hi, please re-install the new version of AWQ following the readme. The new backend kernels can be installed via python setup.py install in awq/kernels.
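One way to confirm that the reinstall took effect is to check which kernels the compiled extension exposes. The following is a minimal sketch, not part of AWQ: the helper names `missing_kernels` and `check_installed` are hypothetical, and the only names taken from this thread are the module `awq_inference_engine` and the attributes `gemm_forward_cuda` and `gemm_forward_cuda_new`.

```python
import importlib


def missing_kernels(module, required):
    """Return the names in `required` that `module` does not expose."""
    return [name for name in required if not hasattr(module, name)]


def check_installed(module_name="awq_inference_engine",
                    required=("gemm_forward_cuda_new",)):
    """Import the extension and list the required kernels it lacks.

    Returns None if the extension (or PyTorch) cannot be imported at all.
    """
    try:
        # PyTorch C++ extensions usually need torch loaded first so that
        # their shared-library dependencies resolve.
        import torch  # noqa: F401
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return missing_kernels(module, required)
```

An empty list from `check_installed()` would suggest the new kernels are present; a list containing `gemm_forward_cuda_new` would suggest an older build is still the one being imported.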

@pribadihcr
Author

> Hi, please re-install the new version of AWQ following the readme. The new backend kernels can be installed via python setup.py install in awq/kernels.

Thanks. I have reinstalled it, but now I get a new error:

python vlm_demo.py --model-path ../cache/VILA-7b --quant-path ../cache/VILA-7b-4bit-awq/vila-7b-w4-g128-v2.pt --precision W4A16 --image-file sample.jpg
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

@ys-2020
Contributor

ys-2020 commented Mar 8, 2024

It looks like awq_inference_engine was built and installed, but without a kernel image that matches your device.

This can happen when the kernels are built on top of a previously cached build. Please remove the build folder in awq/kernels and run python setup.py install again.
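The clean-and-rebuild step can also be scripted. This is a rough sketch under stated assumptions: `clean_build_dir` and `rebuild_kernels` are hypothetical helper names, and the default path `awq/kernels` assumes the script is run from the repository root.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def clean_build_dir(kernels_dir):
    """Delete the cached `build` folder under `kernels_dir`, if present.

    Returns True if a folder was removed, False if there was nothing to remove.
    """
    build_dir = Path(kernels_dir) / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        return True
    return False


def rebuild_kernels(kernels_dir="awq/kernels"):
    """Remove stale build output, then rerun `python setup.py install`."""
    clean_build_dir(kernels_dir)
    # Use the current interpreter so the extension is installed into the
    # environment that will later import it.
    subprocess.run([sys.executable, "setup.py", "install"],
                   cwd=kernels_dir, check=True)
```

Calling `rebuild_kernels()` would then perform both steps; `check=True` makes a failed compile raise instead of passing silently.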

@pribadihcr
Author

Hi, in the middle of building the kernels I suddenly got the following messages:

ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 144; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 159; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 186; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 211; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 218; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 227; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 241; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 255; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 278; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 285; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 294; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 303; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 312; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 335; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 342; error   : Feature 'cp.async' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 868; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_00005e08_00000000-6_gemm_cuda.ptx, line 871; error   : Feature '.m16n8k16' requires .target sm_80 or higher
....

My GPU is an RTX 2070.
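The ptxas messages above state that `cp.async` and `.m16n8k16` require a target of sm_80 or higher, while the RTX 2070 is a Turing card with compute capability 7.5 (sm_75). A quick way to check a device against that requirement is sketched below; `meets_min_arch` and `current_device_capability` are hypothetical helper names, and `torch.cuda.get_device_capability` is the PyTorch call that reports the `(major, minor)` pair.

```python
def meets_min_arch(capability, minimum=(8, 0)):
    """True if a (major, minor) compute capability is at least `minimum`."""
    return tuple(capability) >= tuple(minimum)


def current_device_capability():
    """Return the (major, minor) capability of the current GPU, or None.

    None means PyTorch is not installed or no CUDA device is visible.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()
```

For example, `meets_min_arch((7, 5))` is `False`, which matches the build failure here, while `meets_min_arch((8, 6))` is `True`.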
