
Cannot use the DirectML packages to run on CPU in Windows App #430

Closed
natke opened this issue May 10, 2024 · Discussed in #425 · 6 comments

natke (Contributor) commented May 10, 2024

When I try to run the Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4 model with the DirectML package, I get this error in generator.ComputeLogits():
Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 'Non-zero status code returned while running Expand node. Name:'/model/attn_mask_reformat/input_ids_subgraph/Expand' Status Message: invalid expand shape'
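
For reference, here is a minimal sketch of the generation loop in which the error surfaces, assuming the 0.2.0-era Microsoft.ML.OnnxRuntimeGenAI C# API; the prompt and max_length value are placeholders:

using Microsoft.ML.OnnxRuntimeGenAI;

// Load the CPU model variant named above and build a tokenizer for it.
using var model = new Model(@"Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4");
using var tokenizer = new Tokenizer(model);

var sequences = tokenizer.Encode("<|user|>Hello<|end|><|assistant|>");

using var generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 256);
generatorParams.SetInputSequences(sequences);

// Token-by-token generation; the exception above is thrown by ComputeLogits().
using var generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();
    generator.GenerateNextToken();
}

Console.WriteLine(tokenizer.Decode(generator.GetSequence(0)));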

Discussed in #425

Originally posted by AshD May 9, 2024
Background: Fusion Quill is a Windows AI word processor and chat app on the Microsoft Store. It currently uses llama.cpp to support multiple AI models and switches between CUDA, ROCm, and CPU llama.cpp DLLs depending on the end user's PC capabilities.

How do I switch between the DirectML and CPU GenAI packages at runtime? If the user has a GPU, I want to use the Microsoft.ML.OnnxRuntimeGenAI.DirectML package with the corresponding DirectML model, and if the user does not have a GPU, I want to use the Microsoft.ML.OnnxRuntimeGenAI package with the CPU version of the model.

Thanks,
Ash

AshD commented May 14, 2024

Any update on this? Thanks.

natke (Contributor, Author) commented May 14, 2024

Hi @AshD, I just verified that I could run the CPU version of Phi-3 with the DirectML NuGet package. This was with 0.2.0-rc7, which is hot off the press. Do you want to confirm that this works for you too?

https://www.nuget.org/packages/Microsoft.ML.OnnxRuntimeGenAI.DirectML/0.2.0-rc7

AshD commented May 14, 2024

Hi @natke, I am getting the same error after upgrading to rc7, using the Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4 model.

Are there settings that need to be set? Thanks.

natke (Contributor, Author) commented May 14, 2024

This is really weird. I just downloaded that model and ran it with the rc7 NuGet package, and it works.

Can you list your package dependencies here please?

AshD commented May 15, 2024

I tried different options, and it looks like this is the issue:
generatorParams.SetSearchOption("past_present_share_buffer", false);

It has to be false for CPU and true for DML. With that, it works for both the CPU and DML models :-)
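
A sketch of how this can be wired up at runtime, assuming the app supplies its own GPU detection (DetectGpu() is a hypothetical helper, and the model paths are placeholders):

// Choose the model folder and buffer-sharing mode per device at runtime.
// DetectGpu() is a hypothetical helper; the app provides its own detection logic.
bool useDirectML = DetectGpu();

string modelPath = useDirectML
    ? @"models\Phi-3-mini-128k-instruct-onnx\directml\directml-int4-awq-block-128"
    : @"models\Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4";

using var model = new Model(modelPath);
using var generatorParams = new GeneratorParams(model);

// Per this thread: must be true for the DML model, false for the CPU model.
generatorParams.SetSearchOption("past_present_share_buffer", useDirectML);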

We can close the issue. Is there an API to check if there is a DirectML device present?

natke (Contributor, Author) commented May 21, 2024

Hi @AshD, closing this one. I opened a new issue for the device-detection API question: #488

natke closed this as completed May 21, 2024