Exported an OPT-6.7B (facebook/opt-6.7b) model:
optimum-cli export onnx --model facebook/opt-6.7b opt6.7_fp32_onnx --no-post-process --trust-remote-code
Ran the exported model in an ORT session with the following arguments:
model_args["use_cache"] = True
model_args["use_io_binding"] = True
model_args["trust_remote_code"] = False
model_args["provider"] = "CPUExecutionProvider"
model_args["session_options"] = sess_options
This produced the following sample output for one of the prompts:
Sample_response
To reproduce
Export the OPT-6.7B (facebook/opt-6.7b) model:
optimum-cli export onnx --model facebook/opt-6.7b opt6.7_fp32_onnx --no-post-process --trust-remote-code
Clone the onnxruntime repo (https://github.com/microsoft/onnxruntime/tree/v1.17.0)
from transformers import AutoTokenizer
# ORTOPTForCausalLM comes from Optimum's ONNX Runtime integration;
# its import is not shown in the report.

Tokenizer = AutoTokenizer
CausalLMModel = ORTOPTForCausalLM

# Tokenizer
tokenizer = Tokenizer.from_pretrained(
    model_id,
    token=True,
    cache_dir=cache_dir,
    trust_remote_code=model_args["trust_remote_code"],
)
tokenizer.add_special_tokens({"pad_token": "[PAD]"})

# Model
model = CausalLMModel.from_pretrained(onnx_model_dir, **model_args)
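The steps above stop at loading the model, and the report does not show the prompt or the generation settings used. A minimal sketch of the generation step might look like the following. The helper name generate_response, the example prompt, max_new_tokens=64, and greedy decoding are all assumptions made for illustration, not values from the report.

```python
def generate_response(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a greedy completion for `prompt` and return the decoded text.

    Hypothetical helper: the prompt and generation settings used in the
    report are unknown, so the defaults here are illustrative assumptions.
    """
    # Tokenize the prompt into PyTorch tensors for generate().
    inputs = tokenizer(prompt, return_tensors="pt")
    # Greedy decoding (do_sample=False) keeps runs repeatable, which makes
    # it easier to compare the ONNX output against a PyTorch reference.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # generate() returns one sequence per input prompt; decode the first.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (assumes `model` and `tokenizer` were loaded as shown above;
# the prompt is a placeholder):
# print(generate_response(model, tokenizer, "Hello, my name is"))
```

Running the same helper against the original PyTorch checkpoint with identical settings would give a reference output to compare the ONNX result with.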
Urgency
Accuracy drop with the FP32 model, resulting in project delay
Platform
Windows
OS Version
Microsoft Windows 11 Pro Build 22631
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.17.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU