Is your feature request related to a problem? Please describe.
I find that the current docker image xx.yy-py3 doesn't include commonly used data preprocessing libraries, such as Hugging Face transformers for accessing a tokenizer. Missing this single package greatly limits our ability to use triton-inference-server with its ensemble model feature.
In our specific use case, pip install at runtime and conda-pack are both highly discouraged for various reasons. This is somewhat similar to #6467 and, I believe, is likely common in many other industrial scenarios too.
Describe the solution you'd like
Given the prevalence of Triton server for NLP-related workloads, I would suggest including the transformers library in the pre-built docker image if possible.
Describe alternatives you've considered
There are other images, like 24.03-trtllm-python-py3, that do come with transformers pre-installed. However, we need to serve BERT-like models, and according to triton-inference-server/tensorrtllm_backend#368 there is no clear timeline for supporting them, so we have to rely on another backend (such as ORT) to execute our model.
Additional context
Any thoughts / suggestions will be greatly appreciated!
Unfortunately, we cannot install these libraries, as doing so can increase the container size significantly, and many other customers ask for different libraries to be included. If we accommodated all of these requests, the container would be much larger than it already is. Creating conda-pack environments or building custom images is our only recommendation at this point. Let us know if you have any other suggestions that might help with this issue.
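For anyone landing here: a minimal sketch of the custom-image route suggested above, which bakes transformers into the image at build time so nothing is installed at runtime. The base tag `24.03-py3` and the image name `tritonserver-nlp` are just examples; substitute the release and name you actually deploy.

```dockerfile
# Hypothetical custom image extending the stock Triton container.
# Replace 24.03-py3 with the xx.yy-py3 tag you use.
FROM nvcr.io/nvidia/tritonserver:24.03-py3

# Bake the preprocessing dependency into the image at build time,
# avoiding runtime pip installs and conda-pack environments.
RUN pip install --no-cache-dir transformers
```

Build it once with something like `docker build -t tritonserver-nlp .`, and a python_backend preprocessing model in the ensemble can then `from transformers import AutoTokenizer` directly.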