Model for llama-3-8B, EP: cpu, precision: int4 generated using onnxruntime-genai/src/python/py/models/builder.py has issues #462
Comments
This is the same error as this issue. Can you try the steps specified there? Alternatively, you can upgrade to the latest RC version (0.2.0rc7) once it is released and try again.
In the earlier case (issue #459), I generated the Gemma model using onnxruntime_genai.models.builder. Because that gave an error, you suggested using onnxruntime-genai/src/python/py/models/builder.py instead, which fixed that issue. I used the same approach here for Llama, and it is giving the error reported above.
The fix is the same as in the linked issue: substitute the source builder script for the one installed from the wheel. With the version of ONNX Runtime GenAI that you have installed, this error will keep happening when using the "from wheel" command to generate any INT4 CPU model.
@jmopuri Can you try with onnxruntime-genai version 0.2.0?
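Before retrying, it helps to confirm which onnxruntime-genai build is actually installed. A minimal sketch (not from this thread; it assumes the package's PyPI name is `onnxruntime-genai` and returns None when it is not installed):

```python
from importlib.metadata import version, PackageNotFoundError

def genai_version():
    """Return the installed onnxruntime-genai version string, or None if absent."""
    try:
        return version("onnxruntime-genai")
    except PackageNotFoundError:
        return None

print(genai_version())
```

If this prints a version older than 0.2.0, upgrading with `pip install --upgrade onnxruntime-genai` before regenerating the model is the suggested path.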
I used the llama-3-8B model from Hugging Face to generate an ONNX model with builder.py in onnxruntime-genai. When I try to use that model with the examples in onnxruntime-genai\examples\python, I get this error:
onnxruntime_genai.onnxruntime_genai.OrtException: Load model from .\llama-3-8B_cpu_int4\model.onnx failed:Invalid model. Node input '/model/layers.0/attn/k_proj/repeat_kv/Transpose_2/output_0' is not a graph input, initializer, or output of a previous node.
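For context, a sketch of how such an INT4 CPU model is typically generated with the source builder script. The flags and the Hugging Face model id shown here are assumptions for illustration, not commands quoted from this thread:

```shell
# Sketch: generate an INT4 CPU ONNX model using the source builder script
# rather than the copy installed from the wheel. Paths and model id are
# assumptions for illustration.
python onnxruntime-genai/src/python/py/models/builder.py \
    -m meta-llama/Meta-Llama-3-8B \
    -o ./llama-3-8B_cpu_int4 \
    -p int4 \
    -e cpu
```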