Made changes requested by GKE to meet expectations #5535
Conversation
@@ -89,7 +89,7 @@ properties:
   modelRepositoryPath:
     type: string
     title: Bucket where models are stored. Please make sure the user/service account to create the GKE app has permission to this GCS bucket. Read Triton documentation on configs and formatting details, supporting TensorRT, TensorFlow, Pytorch, Onnx ... etc.
-    default: gs://triton_sample_models/models
+    default: gs://triton_sample_models/23_02
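For illustration, here is a minimal sketch of validating the `gs://` path format that `modelRepositoryPath` expects. The helper function is hypothetical (it is not part of the deployer schema); it only checks the scheme and that a bucket name is present:

```python
def is_valid_model_repository_path(path: str) -> bool:
    # Hypothetical helper: the GKE app expects a GCS path such as
    # gs://my-bucket/models. Only the scheme and a non-empty bucket
    # name are checked here; object existence is not verified.
    if not path.startswith("gs://"):
        return False
    bucket = path[len("gs://"):].split("/", 1)[0]
    return len(bucket) > 0


print(is_valid_model_repository_path("gs://triton_sample_models/23_02"))  # True
print(is_valid_model_repository_path("s3://some-bucket/models"))          # False
```

A real deployment would additionally need the user/service account to have read access to the bucket, as the schema's `title` text notes.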
Do the models change with every release? Also, is the GCS bucket going to be created for every release before we publish on NGC?
CC @mc-nv
- It's a template; the value must be populated by the user.
- In a clean scenario, yes, we create a new storage bucket, but that depends on the user's needs and the particular cluster setup.
We should be able to push a new model for every release; I can talk to Christina to have it set up.
Misha Chornyi commented on Mar 22, 2023, on deploy/gke-marketplace-app/server-deployer/schema.yaml:
I'm afraid this PR is outdated - it needs to be rebased.

rebased
@nv-kmcgill53 @rmccorm4 any updates? Thanks!
Nice changes; mostly just typo/formatting suggestions.
Co-authored-by: Ryan McCormick <mccormick.codes@gmail.com>
Thanks for the changes, @rmccorm4!
@mengdong LGTM, just need to fix the merge conflicts
@rmccorm4 just resolved the merge conflict, thanks!
Lastly @mengdong, is there any reason why we use a TensorRT model for this example? I feel like it adds unnecessary complexity for the user; if we used an ONNX model or something more portable, the public demo bucket wouldn't have to be updated every release. This would also simplify our own upkeep and help ensure the example works more broadly. What do you think? And if TRT is an important part of the demo, we could just keep the documentation on how users would create their own TRT model from the ONNX model for reference.
Thanks @rmccorm4. I chose TRT mostly out of performance considerations; TRT provides the best performance with Triton. We do provide a section in the README on how to convert the ONNX model to TRT. For our upkeep, it should be a very easy process (it could even be automated) to follow the README and update the TRT model in the bucket.
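As a sketch of the ONNX-to-TRT conversion step the README describes, the command is typically built around TensorRT's `trtexec` tool. The helper below only assembles the command line (file names are placeholders, and the wrapper function itself is hypothetical):

```python
def trtexec_command(onnx_path: str, engine_path: str, fp16: bool = True) -> list:
    # Hypothetical wrapper: builds a trtexec invocation that converts an
    # ONNX model into a serialized TensorRT engine. --onnx, --saveEngine,
    # and --fp16 are standard trtexec flags; paths are placeholders.
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        cmd.append("--fp16")  # enable FP16 precision for better throughput
    return cmd


print(" ".join(trtexec_command("model.onnx", "model.plan")))
```

The resulting engine file would then be uploaded to the GCS bucket referenced by `modelRepositoryPath`; note that TRT engines are specific to the GPU architecture and TensorRT version they were built with, which is why the bucket needs updating per release.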
The extra restriction of only working on T4 machines can be cumbersome as well, but if no one's complained so far, then I guess there's no reason to change.
Any updates?
Any more updates?
1. Remove the use of Istio and use a simple ingress solution.
2. Update the HPA version.
3. Use a service account.