
AssertionError: Fail to convert pytorch model #194

Open
anthony-intel opened this issue Mar 27, 2024 · 2 comments

anthony-intel commented Mar 27, 2024

This happens when running the example code only:

from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM
model_name = "Intel/neural-chat-7b-v3-1"     # Hugging Face model_id or local model
prompt = "Once upon a time, there existed a little girl,"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
streamer = TextStreamer(tokenizer)

model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)
print(outputs)

which yields:

2024-03-27 02:12:43 [INFO] Using Neural Speed...
2024-03-27 02:12:43 [INFO] cpu device is used.
2024-03-27 02:12:43 [INFO] Applying Weight Only Quantization.
2024-03-27 02:12:43 [INFO] Using LLM runtime.
cmd: ['python', PosixPath('/usr/local/lib/python3.10/dist-packages/neural_speed/convert/convert_mistral.py'), '--outfile', 'runtime_outs/ne_mistral_f32.bin', '--outtype', 'f32', 'Intel/neural-chat-7b-v3-1']
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-17-40dcb74a8701> in <cell line: 10>()
      8 streamer = TextStreamer(tokenizer)
      9 
---> 10 model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
     11 outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)
     12 print(outputs)

1 frames
/usr/local/lib/python3.10/dist-packages/neural_speed/__init__.py in init(self, model_name, use_quant, use_gptq, use_awq, use_autoround, weight_dtype, alg, group_size, scale_dtype, compute_dtype, use_ggml)
    129         if not os.path.exists(fp32_bin):
    130             convert_model(model_name, fp32_bin, "f32")
--> 131             assert os.path.exists(fp32_bin), "Fail to convert pytorch model"
    132 
    133         if not use_quant:

AssertionError: Fail to convert pytorch model
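For context, the assertion fires because the runtime shells out to a converter script and then checks whether the expected f32 binary actually appeared on disk; any failure inside the converter (for example a missing dependency) surfaces only as this generic message. A minimal sketch of that guard follows (the function name and file paths here are illustrative, not part of neural_speed's API):

```python
import os
import subprocess

def convert_with_check(convert_cmd, fp32_bin):
    """Run a converter subprocess, then verify that the expected output
    file was actually produced, mirroring the guard shown in the
    neural_speed/__init__.py traceback above."""
    if not os.path.exists(fp32_bin):
        subprocess.run(convert_cmd)
        assert os.path.exists(fp32_bin), "Fail to convert pytorch model"
```

Because the converter's own exit status and stderr are not checked before the assertion, the underlying cause (here, packages missing from requirements.txt) is hidden behind the "Fail to convert pytorch model" message.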


zhentaoyu (Contributor) commented

Hi, this issue appears to have the same root cause as #193: pip install neural_speed does not install all of the packages listed in requirements.txt, and we are working on a fix now. As a quick workaround, you can run pip install -r requirements.txt yourself. Thanks.

anthony-intel (Author) commented

@zhentaoyu thanks - looking forward to the fix
