Qwen2 GPTQ break in cpp_model.Model.np_bestla_qpack #163

yuchen2580 · 2024-03-11T03:35:22Z

Hi,
The model I used is Qwen1.5-0.5B-Chat-GPTQ-Int4 from Hugging Face.
After debugging, it seems the model is not converted correctly by:
cpp_model.Model.np_bestla_qpack(

The conversion breaks at this call without showing any error or message.
The program still continues to run, though, and eventually fails on the first generation attempt with:
error loading model: model.cpp: tensor 'model.layers.0.self_attn.q_proj.weight' is missing from model

Zhenzhong1 · 2024-03-11T07:18:31Z

@yuchen2580 Hi, thanks for your issue.

I noticed this problem for Qwen1.5-0.5B-Chat-GPTQ-Int4.

I think this GPTQ model itself probably has a problem. It should be derived from https://hf-mirror.com/Qwen/Qwen1.5-0.5B.

In Qwen1.5-0.5B-Chat-GPTQ-Int4, there is no lm_head.weight in model.safetensors, but the original Qwen1.5-0.5B does have this weight.

If you use these commands to check both models, you will see this odd difference.

from safetensors.torch import load_file

# Load every tensor in the checkpoint, then list the tensor names.
tensors = load_file("model.safetensors")
print(sorted(tensors.keys()))

Qwen1.5-0.5B-Chat-GPTQ-Int4:

Qwen1.5-0.5B:

Qwen1.5-0.5B-Chat-GPTQ-Int4 also doesn't work on HF, which makes me suspect there is something wrong with this model.

Alternatively, you can try https://huggingface.co/Qwen/Qwen1.5-7B-Chat-GPTQ-Int4. It works.
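
For anyone who wants to repeat the key check above without installing torch, here is a minimal sketch that reads only the JSON header of a .safetensors file with the Python standard library. It assumes the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names `safetensors_keys` and `has_lm_head` are made up for illustration and are not part of safetensors or of this project.

```python
import json
import struct


def safetensors_keys(path):
    """Return the tensor names stored in a .safetensors file.

    Only the JSON header is read, so no tensor data is loaded.
    """
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return sorted(k for k in header if k != "__metadata__")


def has_lm_head(path):
    """Report whether the checkpoint contains lm_head.weight."""
    return "lm_head.weight" in safetensors_keys(path)


if __name__ == "__main__":
    import sys

    # Pass one or more model.safetensors paths on the command line.
    for p in sys.argv[1:]:
        print(p, "lm_head.weight present:", has_lm_head(p))
```

Running it against the model.safetensors of both Qwen1.5-0.5B-Chat-GPTQ-Int4 and Qwen1.5-0.5B should show the difference described above, if the files match what was observed here.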

Zhenzhong1 self-assigned this Mar 11, 2024
