OPT6.7b ONNX model not giving accurate results on CPU #1848

Open
pragyam32 opened this issue May 8, 2024 · 0 comments

Exported an OPT-6.7B (facebook/opt-6.7b) model:
optimum-cli export onnx --model facebook/opt-6.7b opt6.7_fp32_onnx --no-post-process --trust-remote-code

Ran the exported model in an ORT session with the following arguments:
model_args["use_cache"] = True
model_args["use_io_binding"] = True
model_args["trust_remote_code"] = False
model_args["provider"] = "CPUExecutionProvider"
model_args["session_options"] = sess_options
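
For reference, the same settings can be collected into a single dictionary. This is only a minimal sketch: `build_model_args` is a hypothetical helper name, and the construction of `sess_options` is not shown in this report (it is assumed to be an `onnxruntime.SessionOptions` instance), so the sketch takes it as a parameter and defaults to `None`.

```python
def build_model_args(sess_options=None):
    """Collect the keyword arguments listed above into one dict.

    sess_options is assumed to be an onnxruntime.SessionOptions object
    created elsewhere; None is used here only as a placeholder.
    """
    return {
        "use_cache": True,
        "use_io_binding": True,
        "trust_remote_code": False,
        # The provider is passed by name, as a string.
        "provider": "CPUExecutionProvider",
        "session_options": sess_options,
    }


model_args = build_model_args()
print(model_args["provider"])
```

Passing the provider as a string avoids a `NameError` when no variable named `CPUExecutionProvider` is defined.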

The model produces the following sample output for one of the prompts:

[Image: Sample_response]

To reproduce
Export the OPT-6.7B (facebook/opt-6.7b) model:
optimum-cli export onnx --model facebook/opt-6.7b opt6.7_fp32_onnx --no-post-process --trust-remote-code

Clone the onnxruntime repo (https://github.com/microsoft/onnxruntime/tree/v1.17.0)

Tokenizer = AutoTokenizer
CausalLMModel = ORTOPTForCausalLM

# Tokenizer
tokenizer = Tokenizer.from_pretrained(
    model_id,
    token=True,
    cache_dir=cache_dir,
    trust_remote_code=model_args["trust_remote_code"],
)
tokenizer.add_special_tokens({"pad_token": "[PAD]"})

# Model
model = CausalLMModel.from_pretrained(onnx_model_dir, **model_args)
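
To make "not accurate" measurable, one option is to compare the logits of the ONNX model against those of the PyTorch reference model for the same prompt. The sketch below is not part of the original repro. It assumes both logit arrays have already been converted to nested Python lists (for example with `.tolist()`). The helper names `max_abs_diff` and `logits_match` and the tolerance `atol=1e-3` are hypothetical choices, and the numbers in the usage lines are made-up toy values, not measurements.

```python
def flatten(values):
    """Yield every number from an arbitrarily nested list or tuple."""
    for v in values:
        if isinstance(v, (list, tuple)):
            yield from flatten(v)
        else:
            yield float(v)


def max_abs_diff(reference, candidate):
    """Return the largest element-wise absolute difference."""
    ref = list(flatten(reference))
    cand = list(flatten(candidate))
    if len(ref) != len(cand):
        raise ValueError("logit arrays have different sizes")
    return max((abs(r - c) for r, c in zip(ref, cand)), default=0.0)


def logits_match(reference, candidate, atol=1e-3):
    """True if every element differs by at most atol (an arbitrary tolerance)."""
    return max_abs_diff(reference, candidate) <= atol


# Toy values for illustration only.
reference_logits = [[1.0, 2.0], [3.0, 4.0]]
onnx_logits = [[1.0, 2.5], [3.0, 4.0]]
print(max_abs_diff(reference_logits, onnx_logits))
print(logits_match(reference_logits, onnx_logits))
```

A large maximum difference on the first forward pass would point at the exported graph or the inputs, while a difference that only appears on later steps would point at the cached decoding path.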

Urgency
The accuracy drop with the FP32 model is causing project delays.

Platform
Windows

OS Version
Microsoft Windows 11 Pro Build 22631

ONNX Runtime Installation
Released Package

ONNX Runtime Version or Commit ID
1.17.0

ONNX Runtime API
Python

Architecture
X64

Execution Provider
Default CPU
