Cannot use the DirectML packages to run on CPU in Windows App #430
Comments
Any update on this? Thanks.
Hi @AshD, I just verified that I could run the CPU version of Phi-3 with the DirectML NuGet package. This was with 0.2.0-rc7, which is hot off the press. Do you want to confirm that this works for you too? https://www.nuget.org/packages/Microsoft.ML.OnnxRuntimeGenAI.DirectML/0.2.0-rc7
Hi @natke, I am getting the same error after upgrading to rc7, using the Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4 model. Are there some settings that need to be set? Thanks.
This is really weird. I just downloaded that model and ran it with the rc7 NuGet package, and it works. Can you list your package dependencies here, please?
I tried different options, and it looks like this is the issue: it has to be false for CPU and true for DML. With that it works for both CPU and DML models :-) We can close the issue. Is there an API to check if a DirectML device is present?
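As far as this thread shows, the GenAI package does not expose such a query. One heuristic is to enumerate the machine's video controllers through WMI and treat anything beyond the software basic display adapter as a candidate DirectML device. A minimal sketch, assuming the System.Management NuGet package; the name filter is an approximation, not a definitive DirectML capability check:

```csharp
using System.Linq;
using System.Management; // System.Management NuGet package (assumption)

static bool LikelyHasDirectMLDevice()
{
    // Enumerate video controllers via WMI. Treating any adapter other
    // than the software "Microsoft Basic Display Adapter" as usable by
    // DirectML is a heuristic, not a guaranteed capability check.
    using var searcher = new ManagementObjectSearcher(
        "SELECT Name FROM Win32_VideoController");
    return searcher.Get()
        .Cast<ManagementObject>()
        .Select(mo => mo["Name"] as string)
        .Any(name => name != null && !name.Contains("Microsoft Basic Display"));
}
```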
When I try to run the Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4 model with the DirectML package, I get this error in generator.ComputeLogits().
Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 'Non-zero status code returned while running Expand node. Name:'/model/attn_mask_reformat/input_ids_subgraph/Expand' Status Message: invalid expand shape'
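For context, a minimal generation loop against the 0.2.0-era C# API looks roughly like the sketch below; the model path is the folder from this report, while the prompt and max_length are placeholders. The exception above surfaces at the ComputeLogits call:

```csharp
using Microsoft.ML.OnnxRuntimeGenAI;

string modelPath = @"Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4";

using var model = new Model(modelPath);      // folder containing genai_config.json
using var tokenizer = new Tokenizer(model);
using var sequences = tokenizer.Encode("<|user|>Hello<|end|><|assistant|>");

using var generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 256);
generatorParams.SetInputSequences(sequences);

using var generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();       // the exception above is thrown here
    generator.GenerateNextToken();
}
```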
Discussed in #425
Originally posted by AshD May 9, 2024
Background: Fusion Quill is a Windows AI word processor and chat app on the Microsoft Store. It currently uses llama.cpp to support multiple AI models, switching between the CUDA, ROCm, and CPU llama.cpp DLLs depending on the end user's PC capabilities.
How do I switch between the DirectML and CPU GenAI packages at runtime? If the user has a GPU, I want to use the Microsoft.ML.OnnxRuntimeGenAI.DirectML package with the corresponding DirectML model; if the user does not have a GPU, I want to use the Microsoft.ML.OnnxRuntimeGenAI package with the CPU version of the model. (See the sketch after this post.)
Thanks,
Ash
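A pattern consistent with what was verified earlier in this thread (the DirectML package can also load CPU models, so a single package reference can serve both paths) is to pick the model folder at runtime and fall back to the CPU model if the DirectML one fails to load. A minimal sketch; the folder paths are illustrative:

```csharp
using Microsoft.ML.OnnxRuntimeGenAI;

static Model LoadModelWithFallback(string dmlModelDir, string cpuModelDir)
{
    try
    {
        // Assumption: loading the DirectML model throws when no usable
        // DirectML device (or driver) is available on this machine.
        return new Model(dmlModelDir);
    }
    catch (OnnxRuntimeGenAIException)
    {
        // Fall back to the CPU-only model folder.
        return new Model(cpuModelDir);
    }
}
```

With this approach, the single Microsoft.ML.OnnxRuntimeGenAI.DirectML package reference covers both cases, matching the behavior confirmed with 0.2.0-rc7 above.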