When I use the official AutoAWQ code to quantize the DeepSeek-Coder-33B-Instruct model, my script is as follows:
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
model_path = '/hy-tmp/deepseek-coder-33b-instruct'
quant_path = '/hy-tmp/deepseek-coder-33b-instruct-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }
model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True, device_map="auto", low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
It reports:
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Token indices sequence length is longer than the specified maximum sequence length for this model (98937 > 16384). Running this sequence through the model will result in indexing errors
Killed
and then my process is killed. How can I solve this problem when applying AWQ to such a large model?
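For context on the length warning: the calibration text is tokenized into a single sequence of 98937 token ids, far beyond the model's 16384-token context window. A minimal pure-Python sketch of the kind of chunking that keeps each calibration sequence within the context window (the token list and `max_len` here are illustrative stand-ins, not AutoAWQ internals):

```python
# Illustrative stand-in: pretend tokenization produced 98937 token ids,
# while the model's context window is 16384 tokens (as in the warning above).
token_ids = list(range(98937))
max_len = 16384

# Split the over-long sequence into context-sized calibration chunks
# instead of running one 98937-token sequence through the model.
chunks = [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]

print(len(chunks))                              # -> 7
print(max(len(c) for c in chunks) <= max_len)   # -> True
```

Whether AutoAWQ exposes a knob for this depends on the version installed; the sketch only shows why feeding one 98937-token sequence triggers the indexing warning.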