Unexpected reshaping of output #7189

Open
lemousehunter opened this issue May 7, 2024 · 1 comment

Comments

@lemousehunter

Description
I have specified [-1, 1024] as the output dimensions for my ensemble model, but the output is still reshaped to [1024].

Triton Information
NVIDIA Release 24.03 (build 86102629)
Triton Server Version 2.44.0

Are you using the Triton container or did you build it yourself?
I am using the NGC Triton Container

To Reproduce

  1. Do not use dynamic batching for the ensemble model
  2. Use dynamic batching for the last model before the output
  3. Set the output dims of the final model in the ensemble to [1024]
  4. Set the output dims of the ensemble model to [-1, 1024]
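
To make the shape reasoning behind these steps concrete, here is a minimal sketch of how Triton's model configuration forms the full tensor shape from `max_batch_size` and `dims`. The helper `full_shape` is hypothetical (it is not a Triton API), and the `max_batch_size` values are assumptions, since the issue does not state them.

```python
def full_shape(max_batch_size, dims):
    """Hypothetical helper mirroring the documented rule:
    if max_batch_size > 0 the batch dimension is implicit and the full
    shape is [-1] + dims; if max_batch_size == 0 the full shape is dims."""
    batch = [-1] if max_batch_size > 0 else []
    return batch + list(dims)


# Assumed values: a batching final model (dynamic batching needs
# max_batch_size > 0) and a non-batching ensemble.
final_model = full_shape(max_batch_size=8, dims=[1024])
ensemble_no_batch = full_shape(max_batch_size=0, dims=[-1, 1024])

# If the ensemble also declared max_batch_size > 0, its dims would gain
# an extra leading dimension.
ensemble_batched = full_shape(max_batch_size=8, dims=[-1, 1024])

print(final_model)        # full shape of the final model's output
print(ensemble_no_batch)  # full shape of the ensemble's output
print(ensemble_batched)
```

Under these assumptions the final model and the non-batching ensemble both describe a `[-1, 1024]` tensor, which is why no reshaping would be expected.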

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

Backend of last model in ensemble: ONNX Runtime

Expected behavior
I expected the output not to be reshaped, since the batched output of the last model in the ensemble has the same dimensions as the output specified for the ensemble model.
bge-m3_config.zip

@lemousehunter
Author

For more context: I am trying to replicate multi-text embedding generation in a single request. BGE-m3 produces an output of shape (2, 1024) for a text input of shape (2,). However, the ensemble model returns an output of shape (2048,) instead; the forced reshaping flattens the bge-m3 output.
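
As a client-side stopgap only (it does not address the reshaping itself), the flattened result can be regrouped into one row per input text. This is a minimal sketch assuming the response arrives as a flat sequence of floats in row-major order; `unflatten` and the default `width` of 1024 are illustrative names and values, not part of any Triton client API.

```python
def unflatten(flat, width=1024):
    """Regroup a flat sequence into rows of `width` elements each.

    Assumes row-major order, i.e. the first `width` values belong to the
    first input text, the next `width` values to the second, and so on.
    """
    if len(flat) % width != 0:
        raise ValueError(
            f"length {len(flat)} is not a multiple of width {width}"
        )
    return [list(flat[i:i + width]) for i in range(0, len(flat), width)]


# Stand-in for the (2048,) output described above.
flat_output = [float(i) for i in range(2048)]
rows = unflatten(flat_output)
print(len(rows), len(rows[0]))
```

With a NumPy array on the client, `flat.reshape(-1, 1024)` would do the same regrouping.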
