Description
I am using Triton Inference Server with the TensorRT backend, sequence batching with the Oldest scheduling strategy, and implicit state management. I would like to find the most efficient way to update my inference model without causing downtime.
One approach is to request the model states from the old Triton instance (the one serving the old model) just before it is terminated during model replacement. These states would then be sent to the other, still-operational Triton instances so they can keep handling client requests until the transition is complete. Implementing this approach requires access to the current states stored in Triton.
Question
Is it possible to extract model states stored in Triton via its API?
For debugging purposes, the client can request the output state. In order to allow the client to request the output state, the output section of the model configuration must list the output state as one of the model outputs. Note that requesting the output state from the client can increase the request latency because of the additional tensors that have to be transferred.
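As a rough illustration of the above, here is a minimal sketch of an HTTP inference request body (KServe v2 protocol with Triton's sequence extension) that asks for the output state alongside the regular output. The tensor names `INPUT`, `OUTPUT`, and `OUTPUT_STATE`, the `FP32` datatype, the `[1, N]` shape, and the helper `build_infer_request` are all assumptions made for illustration; they must match whatever your model configuration actually declares, and `OUTPUT_STATE` must be listed in the model's output section for the request to succeed.

```python
import json


def build_infer_request(input_name, values, sequence_id, output_names,
                        start=False, end=False):
    """Build a JSON-serializable request body for one step of a sequence.

    Hypothetical helper: tensor names, datatype, and shape are assumptions.
    """
    return {
        # Sequence control parameters from Triton's sequence extension.
        "parameters": {
            "sequence_id": sequence_id,
            "sequence_start": start,
            "sequence_end": end,
        },
        "inputs": [
            {
                "name": input_name,
                "shape": [1, len(values)],  # assumed batch dimension of 1
                "datatype": "FP32",         # assumed datatype
                "data": list(values),
            }
        ],
        # Listing the state tensor here asks Triton to return it with the response.
        "outputs": [{"name": name} for name in output_names],
    }


body = build_infer_request(
    "INPUT", [1.0, 2.0], sequence_id=42,
    output_names=["OUTPUT", "OUTPUT_STATE"], start=True,
)
# The payload would be POSTed to /v2/models/<model_name>/infer.
payload = json.dumps(body)
```

Note that a request like this returns only the state produced by that particular request, so, as far as I can tell, the states would have to be collected per sequence by the client rather than dumped from the server in one call.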