Description:
I am facing an issue with pinning the TensorRT version in Triton Server. I exported my models as .plan files using TensorRT 10.0, because version 8.6.1 does not support INT64 operations, which led to significant precision loss, and it also produced batch-processing errors when exporting the bce-rerank model. After consulting the documentation and doing some research, it appears that TensorRT 10.0 resolves these issues.
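For context, a TensorRT 10 export of this kind is typically done with `trtexec`; a minimal sketch, assuming a hypothetical ONNX model and input name (the actual model, input names, and shape ranges from the post are not known):

```shell
# Build a TensorRT 10 engine (.plan) from an ONNX export.
# model.onnx, input_ids, and the shape ranges are illustrative placeholders.
trtexec --onnx=model.onnx \
        --saveEngine=model.plan \
        --minShapes=input_ids:1x1 \
        --optShapes=input_ids:8x128 \
        --maxShapes=input_ids:32x512 \
        --fp16
```

The resulting `model.plan` can only be loaded by a runtime with a matching TensorRT major version, which is the compatibility constraint at the heart of this issue.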
However, the latest NGC container for Triton Server only includes TensorRT 8.6.3, which fails to load my model. I attempted the following methods to upgrade the TensorRT version:
1. Pulled the full Triton Server 24.03 container and upgraded to TensorRT 10 inside it, but the server still attempts to load version 8.6.3. This led me to believe a backend change is necessary, so I tried the next step.
2. Pulled the TensorRT backend from this GitHub repository and attempted to compile it against TensorRT 10, but encountered errors that also appear to indicate a version mismatch.
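A quick way to confirm the version-mismatch diagnosis from step 1 is to inspect which `libnvinfer` the TensorRT backend actually links at runtime; a sketch, assuming the default backend install path inside the NGC container:

```shell
# Inside the 24.03 container: show the libnvinfer the TensorRT backend links against.
# The backend path below is the container's default; adjust if backends were relocated.
ldd /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so | grep libnvinfer

# Compare with the TensorRT Python package version, if one is installed:
python3 -c "import tensorrt; print(tensorrt.__version__)"
```

If `ldd` resolves to `libnvinfer.so.8` while the engines were built with TensorRT 10, the backend binary itself (not just the libraries installed on disk) has to change, which matches the observation in step 1.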
Question:
How can I resolve this issue to use TensorRT 10 for inference in Triton Server? Any advice or insights on how to successfully deploy and run inference with the latest version of TensorRT in Triton Server would be greatly appreciated!
Triton Information
Triton Server 24.03 container
Expected behavior
The .plan engines exported with TensorRT 10 load and serve successfully in Triton Server.
I've read through some issues where adjustments were made to the Triton Server containers by selecting appropriate versions. I am wondering if it is possible to upgrade only the TensorRT version within the current container, or should I wait for an official NGC container that includes TensorRT 10?
Hi @Gcstk, thanks for bringing this up. There will be some API changes and fixes needed if you'd like to compile the TRT backend with TRT 10. I'd recommend waiting until we officially support TRT 10, which will happen with Triton 24.05. Note that the integration is still in progress, and not all features will be supported as of 24.05.
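Once a Triton release with TensorRT 10 support is published on NGC (24.05, per the comment above), switching should amount to pulling that container and pointing it at the existing model repository; a sketch with a placeholder repository path:

```shell
# Pull the release that is expected to ship TensorRT 10 support.
docker pull nvcr.io/nvidia/tritonserver:24.05-py3

# Serve the TensorRT 10 .plan models; /path/to/model_repository is a placeholder.
docker run --gpus all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:24.05-py3 \
  tritonserver --model-repository=/models
```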