[Inference]: NVIDIA Triton Server with TensorRT-LLM pattern #508

Open
vara-bonthu opened this issue Apr 25, 2024 · 0 comments
Labels
gen-ai pattern Distributed Training and Inference Patterns for Various Generative AI Large Language Models (LLMs)

@vara-bonthu (Contributor)

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

What is the outcome that you are trying to reach?

  • Create a new pattern that uses NVIDIA Triton Inference Server with TensorRT-LLM
  • Showcase the pattern with any LLM

Describe the solution you would like

Describe alternatives you have considered

Additional context

@vara-bonthu added the gen-ai pattern (Distributed Training and Inference Patterns for Various Generative AI Large Language Models (LLMs)) label Apr 25, 2024
@vara-bonthu changed the title from "[Inference]: NVIDIA Triton Server with TRT pattern" to "[Inference]: NVIDIA Triton Server with TensorRT-LLM pattern" May 7, 2024