trt_profile_max_shapes not supported for ONNX-TRT backend #7156

Closed
ShuaiShao93 opened this issue Apr 24, 2024 · 1 comment

ShuaiShao93 commented Apr 24, 2024

Description
trt_profile_max_shapes is documented here, but the server rejects it as an unknown parameter.

Triton Information
24.03

To Reproduce
Add this to the ONNX model config:

optimization {
  execution_accelerators {
    gpu_execution_accelerator {
      name: "tensorrt"
      parameters {
        key: "max_workspace_size_bytes"
        value: "16000000000"
      }
      parameters {
        key: "precision_mode"
        value: "FP16"
      }
      parameters {
        key: "trt_engine_cache_enable"
        value: "1"
      }
      parameters {
        key: "trt_profile_max_shapes"
        value: "input_ids:8x256,attention_mask:8x256"
      }
      parameters {
        key: "trt_profile_min_shapes"
        value: "input_ids:1x256,attention_mask:1x256"
      }
      parameters {
        key: "trt_profile_opt_shapes"
        value: "input_ids:8x256,attention_mask:8x256"
      }
    }
  }
}

Start the Triton server. It fails with:

Invalid argument: unknown parameter 'trt_profile_max_shapes' is provided for TensorRT Execution Accelerator
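
The three profile values in the config above share a `name:AxB,name:CxD` format. As a side check that is independent of the server error, here is a minimal Python sketch for sanity-checking such strings before starting the server. It assumes the usual TensorRT optimization-profile rule that min <= opt <= max must hold in every dimension; the function names `parse_profile_shapes` and `check_profiles` are hypothetical helpers, not part of Triton or ONNX Runtime.

```python
from typing import Dict, List, Tuple


def parse_profile_shapes(value: str) -> Dict[str, Tuple[int, ...]]:
    """Parse 'name:AxB,other:CxD' into {'name': (A, B), 'other': (C, D)}."""
    shapes = {}
    for entry in value.split(","):
        # rpartition keeps any colons that are part of the tensor name itself.
        name, _, dims = entry.strip().rpartition(":")
        if not name or not dims:
            raise ValueError(f"malformed entry: {entry!r}")
        shapes[name] = tuple(int(d) for d in dims.split("x"))
    return shapes


def check_profiles(min_s: str, opt_s: str, max_s: str) -> List[str]:
    """Return a list of problems; an empty list means the profiles are consistent."""
    lo, opt, hi = (parse_profile_shapes(s) for s in (min_s, opt_s, max_s))
    if not (lo.keys() == opt.keys() == hi.keys()):
        return ["min/opt/max profiles name different sets of inputs"]
    problems = []
    for name in lo:
        a, b, c = lo[name], opt[name], hi[name]
        if not (len(a) == len(b) == len(c)):
            problems.append(f"{name}: rank mismatch")
            continue
        for i, (x, y, z) in enumerate(zip(a, b, c)):
            # Assumed rule: each dimension must satisfy min <= opt <= max.
            if not (x <= y <= z):
                problems.append(
                    f"{name}: dim {i} violates min <= opt <= max ({x}, {y}, {z})"
                )
    return problems


if __name__ == "__main__":
    # Values copied from the config in this issue.
    print(check_profiles(
        "input_ids:1x256,attention_mask:1x256",
        "input_ids:8x256,attention_mask:8x256",
        "input_ids:8x256,attention_mask:8x256",
    ))
```

With the values from this issue the check reports no problems, which is consistent with the failure being the unrecognized parameter name rather than the shape strings themselves.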

Expected behavior
The parameter should be supported, as documented.

rmccorm4 (Collaborator) commented May 1, 2024

Hi @ShuaiShao93,

I believe support for those parameters was added recently here and should be included in the 24.04 release, which was released today. Please try it out and open a new issue if it still does not work.
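
For deployment scripts that generate the model config, one option is to emit the trt_profile_* parameters only when the target release is new enough. The sketch below assumes Triton release tags of the form `YY.MM` (as in 24.03 and 24.04 in this thread) and takes the 24.04 threshold from the reply above, which is itself stated as a belief rather than a confirmed fact. `release_at_least` is a hypothetical helper.

```python
from typing import Tuple


def _release_key(tag: str) -> Tuple[int, int]:
    """Turn a 'YY.MM' release tag into a comparable (year, month) tuple."""
    year, month = tag.strip().split(".")[:2]
    return int(year), int(month)


def release_at_least(release: str, minimum: str = "24.04") -> bool:
    """True if `release` is the same as or newer than `minimum`.

    The default of 24.04 is an assumption taken from this thread.
    """
    return _release_key(release) >= _release_key(minimum)


if __name__ == "__main__":
    for tag in ("24.03", "24.04"):
        print(tag, release_at_least(tag))
```

Comparing integer tuples rather than raw strings avoids ordering surprises if a tag is ever written without zero padding.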

rmccorm4 closed this as completed May 1, 2024