
Made requested change by GKE to meet expectation #5535

Merged: 16 commits merged into triton-inference-server:main on Apr 6, 2023

Conversation

@mengdong (Contributor):

1. Remove the use of Istio and use a simple Ingress solution (see the sketch after this list).
2. Update the HPA API version.
3. Use a service account.
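
For context, here is a minimal sketch (not taken from this PR) of what a plain Kubernetes Ingress and an `autoscaling/v2` HPA might look like; the service, deployment, port, and metric names are illustrative assumptions, not the app's actual manifests.

```yaml
# Illustrative sketch only: a plain Ingress in place of an Istio gateway,
# plus an HPA on the autoscaling/v2 API. All names and numbers are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: triton-ingress
spec:
  defaultBackend:
    service:
      name: triton-inference-server   # hypothetical Service name
      port:
        number: 8000                  # Triton's default HTTP port
---
apiVersion: autoscaling/v2            # updated from the older v2beta API versions
kind: HorizontalPodAutoscaler
metadata:
  name: triton-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: triton-inference-server     # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu                     # placeholder metric; the real app may scale on other signals
        target:
          type: Utilization
          averageUtilization: 70
```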

```diff
@@ -89,7 +89,7 @@ properties:
   modelRepositoryPath:
     type: string
     title: Bucket where models are stored. Please make sure the user/service account to create the GKE app has permission to this GCS bucket. Read Triton documentation on configs and formatting details, supporting TensorRT, TensorFlow, Pytorch, Onnx ... etc.
-    default: gs://triton_sample_models/models
+    default: gs://triton_sample_models/23_02
```
Do the models change with every release? Also, is the GCS bucket going to be created for every release before we publish on NGC?

CC @mc-nv

@mc-nv (Collaborator) commented Mar 22, 2023:

1. It's a template; the value must be populated by the user (see the illustrative snippet below).
2. In a clean scenario, yes, we create a new storage bucket, but that depends on the user's needs and the particular cluster setup.
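
To illustrate item 1: a user deploying the app would supply their own GCS path rather than rely on the sample default. The `modelRepositoryPath` key below comes from the schema diff above; the bucket value is made up.

```yaml
# Hypothetical deploy-time parameter for the GKE marketplace app.
# Only the modelRepositoryPath key comes from the schema above; the bucket is illustrative.
modelRepositoryPath: gs://my-project-triton-models/model_repository
```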

@mengdong (Contributor, Author):

We should be able to push a new model for every release; I can talk to Christina to have it set up.

mengdong commented Mar 23, 2023 via email

@mengdong (Contributor, Author):

@nv-kmcgill53 @rmccorm4 any updates? Thanks!

@rmccorm4 (Collaborator) left a comment:

Nice changes, mostly just typos/formatting suggestions

Review suggestions (all resolved):
- deploy/gke-marketplace-app/README.md (6 suggestions)
- deploy/gke-marketplace-app/trt-engine/README.md (1 suggestion)
mengdong and others added 7 commits on March 30, 2023, 09:57; six of them co-authored by Ryan McCormick <mccormick.codes@gmail.com>.
@mengdong (Contributor, Author):

thanks for the changes @rmccorm4

@rmccorm4 (Collaborator):

@mengdong LGTM, just need to fix the merge conflicts

@mengdong (Contributor, Author):

@rmccorm4 just resolved the merge conflict, thanks!

@rmccorm4 (Collaborator):

Lastly @mengdong, is there any reason why we use a TensorRT model for this example? I feel like it adds unnecessary complexity for the user. That is, if we used an ONNX model or something more portable, the public demo bucket wouldn't have to be updated every release. This would also simplify our own upkeep and help make sure the example works more broadly. What do you think?

And if TRT is an important part of the demo, we could just keep the documentation on how users would create their own TRT model from the ONNX model for reference.

@mengdong (Contributor, Author):

Thanks @rmccorm4. I chose TRT mostly out of performance considerations; TRT provides the best performance with Triton.

We do provide a section (README) on how to convert the ONNX model to TRT.

For our upkeep, it should be a very easy process (which could be automated) to follow the README and update the TRT model in the bucket.

@rmccorm4 (Collaborator):

> For our upkeep, it should be a very easy process (which could be automated) to follow the README and update the TRT model in the bucket.

The extra restriction of only working on T4 machines can be cumbersome as well, but if no one has complained so far, then I guess there's no reason to change.

@rmccorm4 previously approved these changes on Mar 30, 2023.
@mengdong (Contributor, Author) commented Apr 4, 2023:

Any updates...?

@mengdong (Contributor, Author) commented Apr 6, 2023:

Any more updates...?

@nv-kmcgill53 merged commit fd8d3fa into triton-inference-server:main on Apr 6, 2023.