[Build] Propagate build option for CUDA minimal to TRT #20695
base: main
Conversation
Can you rebase it to main?
Force-pushed from 6566d23 to c6d7bb0
@chilo-ms Sure, done.
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, ONNX Runtime React Native CI Pipeline, Windows x64 QNN CI Pipeline
/azp run Linux MIGraphX CI Pipeline, orttraining-amd-gpu-ci-pipeline
Azure Pipelines successfully started running 2 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
file(GLOB_RECURSE onnxruntime_providers_cuda_cu_srcs CONFIGURE_DEPENDS
  "${ONNXRUNTIME_ROOT}/core/providers/cuda/*.cu"
  "${ONNXRUNTIME_ROOT}/core/providers/cuda/*.cuh"
)
else()
  set(onnxruntime_providers_cuda_cu_srcs
    "${ONNXRUNTIME_ROOT}/core/providers/cuda/math/unary_elementwise_ops_impl.cu"
It seems we need to include unary_elementwise_impl.cuh as well? The definition of UnaryElementWiseImpl() is in that file and is used by cuda::Impl_Cast<SrcT, DstT>, which the TRT EP calls to cast DOUBLE <-> FLOAT or INT64 <-> INT32.
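For illustration, explicitly listing the header next to its .cu file in the minimal source set would look roughly like the sketch below. The .cuh filename is an assumption modeled on the .cu path in the snippet above; whether headers should appear in the source list at all is exactly what this thread goes on to discuss.

```cmake
# Sketch only: headers are usually resolved via include directories rather
# than listed as sources, but adding the .cuh explicitly would look like this.
set(onnxruntime_providers_cuda_cu_srcs
  "${ONNXRUNTIME_ROOT}/core/providers/cuda/math/unary_elementwise_ops_impl.cu"
  "${ONNXRUNTIME_ROOT}/core/providers/cuda/math/unary_elementwise_ops_impl.cuh"
)
```

Listing a header as a source is harmless in CMake (it mainly aids IDE project generation), which is why the thread concludes it is unnecessary here.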
The header will be picked up since the include directory is set correctly. If you want, I can add it to the sources, but I would say headers are usually not listed as sources in CMake.
Yeah, true; then there's no need to add it to the sources.
Description
Extend the CUDA minimal build option to the TRT provider, since with TRT 10 linking against cuDNN is no longer required.
Besides that, with the new engine dump feature it is also possible to embed an engine into an ONNX model and avoid shipping a builder lib.
In addition, this has roughly the same deserialization/session setup time as using TRT standalone.
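As a sketch of what "propagating" the option means on the CMake side: the TRT provider's cuDNN linkage would be gated on the same CUDA-minimal switch that already trims the CUDA EP. All names below (the option, target, and cuDNN library variable) are assumptions for illustration, not taken from this PR's actual diff.

```cmake
# Hypothetical: skip cuDNN when the CUDA-minimal option is enabled, mirroring
# the pattern the CUDA EP uses. Option/target names are illustrative only.
if(NOT onnxruntime_CUDA_MINIMAL)
  target_link_libraries(onnxruntime_providers_tensorrt PRIVATE ${CUDNN_LIBRARIES})
endif()
```

With TRT 10 dropping its cuDNN dependency, such a guard lets a TRT-enabled minimal build avoid pulling cuDNN in at all.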
Motivation and Context