Describe the bug
The model server timeout ("used for model server's backend workers before they are deemed unresponsive and rebooted"), currently set through the SAGEMAKER_MODEL_SERVER_TIMEOUT environment variable, is documented as being in seconds in the property's docstring...
    def model_server_timeout(self):  # type: () -> int
        """int: Timeout, in **seconds**, used for model server's backend workers before
        they are deemed unresponsive and rebooted.
        """
        return self._model_server_timeout
...but the actual unit used downstream in multi-model server worker manager is minutes, not seconds.
    // TODO: Change this to configurable param
    ModelWorkerResponse reply = replies.poll(responseTimeout, TimeUnit.MINUTES);
Because of this, the default timeout of 20 in inference-toolkit is actually a 20-minute timeout, not a 20-second timeout.
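To make the size of the mismatch concrete, here is a minimal Python sketch of the arithmetic. It only uses the default value of 20 quoted above; the variable names are illustrative and do not come from either codebase.

```python
from datetime import timedelta

DEFAULT_TIMEOUT = 20  # default value cited in this issue

# What the docstring promises: the value is a number of seconds.
documented = timedelta(seconds=DEFAULT_TIMEOUT)

# What happens when the same number is handed to a poll that uses minutes.
effective = timedelta(minutes=DEFAULT_TIMEOUT)

# How many times longer the effective timeout is than the documented one.
ratio = effective / documented

print(documented.total_seconds(), effective.total_seconds(), ratio)
```

In other words, a worker would be given 60 times longer than the docstring suggests before being treated as unresponsive.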
It seems odd that the unit is minutes. Because the value is parsed as an int by argparse in inference-toolkit, it only gives a resolution of whole minutes (instead of, say, 0.33 minutes for a 20 s equivalent timeout). Should I report this downstream in multi-model-server? If you don't want to change it, we should at least fix the docstring in inference-toolkit.
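The resolution problem can be sketched as follows. argparse applies the callable given as `type` to the raw string, so `type=int` amounts to calling `int()` on the value. The helper `parse_whole_units` below is hypothetical and only mimics that conversion; it is not code from inference-toolkit.

```python
def parse_whole_units(raw):
    """Mimic an int conversion of a raw string; return None if it is rejected."""
    try:
        return int(raw)
    except ValueError:
        return None


# A whole number is accepted as-is.
whole = parse_whole_units("20")

# A fractional value such as 0.33 (about 20 seconds, if the unit is minutes)
# cannot be expressed: int() rejects the string outright.
fractional = parse_whole_units("0.33")

# The smallest positive timeout an int can carry is 1; in minutes that is 60 seconds.
smallest_positive_seconds = 1 * 60

print(whole, fractional, smallest_positive_seconds)
```

So as long as the downstream unit stays at minutes and the value stays an int, nothing shorter than a one-minute timeout can be requested.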