[Question] What is the recommended way to run Triton? #5981

Does each instance of InferenceServerClient reuse stubs / channels?

For the Python gRPC client, each instance of InferenceServerClient creates a new channel; it does not reuse an existing one.

I am serving multiple models per Triton container. Does each instance of InferenceServerClient hold a unique stub / channel / connection per model, or does it create a new stub / channel / connection per model on each request?

There is a single channel connection per InferenceServerClient. The same channel is used for all requests: all model infer requests, as well as non-infer requests, are pushed through the same channel.

We don't have a specific recommendation to follow. However, our C++ client library is more…

Answer selected by MatthieuToulemont
Category: Q&A
Labels: question (Further information is requested)
5 participants
This discussion was converted from issue #5978 on June 23, 2023 01:50.