
Does opt_13b model support tensor parallelism via inferflow? #49

Open
LHQUer opened this issue Mar 18, 2024 · 2 comments

LHQUer commented Mar 18, 2024

The settings are as follows:

devices = 0&1&2&3;4&5&6&7
decoder_cpu_layer_count = 0
cpu_threads = 8

max_concurrent_queries = 6

return_output_tensors = true

;debug options
is_study_mode = false
show_tensors = false

When I run opt_13b with Inferflow, the error is as follows:

Configuration = release; Platform = x64
========== ========== ========== ========== ========== ==========
Loading model specifications...
Loading model opt_13b...
vocab_size: 50272, embd_dims: 5120, decoder layers: 40, decoder heads: 40, decoder kv heads: 40
qkv_format 1 is not compatible with tensor parallelism
Failed to load the model
Failed to initialize the inference engine
Memory usage (MB): 203.21, 203.21 (Peak)
Press the enter key to quit...

shumingshi (Collaborator) commented:

Thank you for raising this issue. Tensor parallelism is not currently supported for this model. We will try to fix this. As a mitigation, you can either apply pipeline parallelism or serve the model on a single GPU device (together with quantization if the VRAM of each device is less than 32 GB).
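
For illustration, here is a minimal sketch of how the devices setting might be adjusted for these workarounds. It assumes, based on the settings quoted above, that '&' groups devices for tensor parallelism within a stage and ';' separates pipeline stages; only keys already shown in this issue are used, and the exact semantics should be verified against the Inferflow configuration documentation.

; Pipeline parallelism only: no '&' groups, so the tensor-parallel code path is avoided
devices = 0;1;2;3;4;5;6;7
decoder_cpu_layer_count = 0
cpu_threads = 8
max_concurrent_queries = 6

; Alternatively, serve on a single GPU (use a quantized model file if the device has less than 32 GB of VRAM)
;devices = 0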

LHQUer (Author) commented Mar 19, 2024


Thanks for your general answer!
