How to extract model states stored in Triton (Implicit State Management) #7119

Open
chuikova-e opened this issue Apr 15, 2024 · 3 comments
Labels
question Further information is requested

Comments

@chuikova-e

Description
I am using Triton Inference Server with the TensorRT backend, sequence batching with the Oldest scheduling strategy, and implicit state management. I am looking for the most efficient way to update my inference model without downtime.

One approach is to request the model states from the old Triton instance (the one serving the old model) as it shuts down during model replacement, and then send those states to the Triton instances that are still running so they can keep handling client requests until the transition completes. This approach requires access to the states currently stored in Triton.

Question
Is it possible to extract model states stored in Triton via its API?

@Tabrizian
Member

If you want the state returned to the client, add an output to the model configuration with the same name as the output state. The implicit state management documentation (https://github.com/triton-inference-server/server/blob/main/docs/user_guide/architecture.md#implicit-state-management) says:

For debugging purposes, the client can request the output state. In order to allow the client to request the output state, the output section of the model configuration must list the output state as one of the model outputs. Note that requesting the output state from the client can increase the request latency because of the additional tensors that have to be transferred.
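
As a rough illustration of the above, here is a minimal sketch of an inference request that asks for the output state alongside the regular output. It builds a JSON body for the HTTP inference endpoint (`POST /v2/models/<model>/infer`) using the sequence parameters `sequence_id`, `sequence_start`, and `sequence_end`. The tensor names `INPUT`, `OUTPUT`, and `OUTPUT_STATE`, the shape, the datatype, and the helper `build_infer_request` are assumptions made for the example and do not come from the issue. The request only returns the state if the model configuration also lists `OUTPUT_STATE` in its output section.

```python
import json

# Assumed model configuration (not from the issue): the state's output_name
# is also declared as a regular model output so the client can request it.
#
#   sequence_batching {
#     state [ { input_name: "INPUT_STATE"  output_name: "OUTPUT_STATE"
#               data_type: TYPE_FP32  dims: [ 2 ] } ]
#   }
#   output [
#     { name: "OUTPUT"        data_type: TYPE_FP32  dims: [ 2 ] },
#     { name: "OUTPUT_STATE"  data_type: TYPE_FP32  dims: [ 2 ] }
#   ]


def build_infer_request(sequence_id, input_name, data, shape, datatype,
                        output_names, start=False, end=False):
    """Build a JSON body for POST /v2/models/<model>/infer (hypothetical helper)."""
    return {
        # Sequence control parameters used by the sequence batcher.
        "parameters": {
            "sequence_id": sequence_id,
            "sequence_start": start,
            "sequence_end": end,
        },
        "inputs": [
            {"name": input_name, "shape": shape, "datatype": datatype, "data": data}
        ],
        # Requesting the state is just a matter of naming it as an output.
        "outputs": [{"name": name} for name in output_names],
    }


body = build_infer_request(
    sequence_id=42,
    input_name="INPUT",
    data=[1.0, 2.0],
    shape=[1, 2],
    datatype="FP32",
    output_names=["OUTPUT", "OUTPUT_STATE"],
    start=True,
)
print(json.dumps(body, indent=2))
```

Each response then carries the latest state for that sequence, so a client or proxy could keep a copy per `sequence_id`. As the quoted documentation notes, the extra tensor adds transfer cost to every request.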

@chuikova-e
Author

Is that the only way to extract the states?

@Tabrizian
Member

Currently, yes, that's the only way. Let us know if you have ideas for other ways to extract states.
