I am using Triton in production on H100 GPUs, and I am running into issues where certain requests trigger CUDA errors. These errors usually break the GPU for the lifetime of the process; restarting the container usually resolves the issue.
Hence, I was wondering whether there is a way to restart the container from within when we detect those errors.
Best,
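One way to sketch the "restart from within" idea (a minimal sketch with hypothetical helper names and an illustrative, non-exhaustive list of error strings): classify the error message, and if it looks like a fatal CUDA failure, exit the process with a non-zero code so the container supervisor, assuming a restart policy such as Docker `--restart=always` or a Kubernetes liveness/restart setup, brings up a fresh container.

```python
import sys

# Substrings that typically indicate the CUDA context is unusable for the
# rest of the process lifetime (illustrative list, not exhaustive).
_FATAL_CUDA_PATTERNS = (
    "an illegal memory access was encountered",
    "unspecified launch failure",
    "CUDA_ERROR_ILLEGAL_ADDRESS",
    "device-side assert triggered",
)

def is_unrecoverable_cuda_error(message: str) -> bool:
    """Return True if the error message looks like a fatal CUDA failure."""
    return any(pattern in message for pattern in _FATAL_CUDA_PATTERNS)

def exit_if_fatal(message: str) -> None:
    """Exit non-zero so the container's restart policy replaces the process."""
    if is_unrecoverable_cuda_error(message):
        sys.exit(70)  # any non-zero exit code triggers the restart policy
```

The key design point is that the process does not try to recover the broken CUDA context in place; it deliberately dies and lets the orchestrator handle the restart.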
If we enable strict readiness with model control mode set to `none`, would `/v2/health/ready` return false if one of the models is unhealthy? For instance, when a Python model stub is considered unhealthy.
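As a sketch of wiring that endpoint into a health check (the URL and default port are assumptions based on Triton's standard HTTP setup): Triton answers `GET /v2/health/ready` with HTTP 200 when it considers itself ready, so a probe can treat anything else, including connection failures, as unhealthy.

```python
import urllib.error
import urllib.request

def triton_ready(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True iff Triton's readiness endpoint answers with HTTP 200.

    Connection errors and timeouts are treated as "not ready", which is the
    behavior a liveness/readiness probe usually wants.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/v2/health/ready", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

A Kubernetes liveness probe pointed at the same path would then restart the pod automatically whenever readiness stays false, which also addresses the original "restart from within" question without custom code inside the container.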
Thanks for the tremendous work here.