Can we include commonly used data pre-processing libraries in the Triton server Docker image? #7107

Open
HQ01 opened this issue Apr 12, 2024 · 2 comments
Labels
question Further information is requested

Comments


HQ01 commented Apr 12, 2024

Is your feature request related to a problem? Please describe.

I find that the current Docker image xx.yy-py3 doesn't include commonly used data preprocessing libraries, for example Hugging Face transformers for accessing a tokenizer. Missing this single package greatly limits our ability to use triton-inference-server with its ensemble model feature.

In our specific use case, running pip install at runtime or using conda-pack is highly discouraged for various reasons. This is somewhat similar to #6467, and I believe it may be common in many other industrial scenarios too.

Describe the solution you'd like

Given how commonly Triton Server is used for NLP-related workloads, I would suggest including the transformers library in the pre-built Docker image if possible.

Describe alternatives you've considered

There are other images, like 24.03-trtllm-python-py3, that do come with transformers pre-installed. However, we need to serve BERT-like models, and according to triton-inference-server/tensorrtllm_backend#368, there is no clear timeline for supporting them. So we have to rely on another backend (like ORT) to execute our model.

Additional context
Any thoughts / suggestions will be greatly appreciated!


MatthieuToulemont commented Apr 17, 2024

"In our specific use case, running pip install at runtime"

How about building your own image on top of xx.yy-py3?
That way you will not run pip at runtime or require conda-pack.

"Given how commonly Triton Server is used for NLP-related workloads"

In our case, we use Triton for computer vision models and don't need transformers installed.

FROM nvcr.io/nvidia/tritonserver:XX.YY-py3
RUN pip install transformers --no-cache-dir

This Dockerfile will do what you need without requiring everyone to have transformers installed by default. Maybe this could work?
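
For example, the custom image could be built and run along these lines (the image tag and model repository path here are just illustrative):

# build the custom image from the Dockerfile above
docker build -t tritonserver-transformers .
# run it, exposing Triton's standard HTTP, gRPC, and metrics ports
docker run --gpus all --rm \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/model_repository:/models \
  tritonserver-transformers tritonserver --model-repository=/models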

Tabrizian (Member) commented

Unfortunately, we cannot install these libraries, as they can increase the container size significantly, and there are many other customers asking for different libraries to be included. If we accommodated all these requests, the container would be much larger than it already is. Creating conda-pack environments or building custom images is our only recommendation at this point. Let us know if you have any other suggestions that might help with this issue.
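
For reference, a minimal sketch of the conda-pack route with the Python backend; the environment name, Python version, and archive path are illustrative, while EXECUTION_ENV_PATH and $$TRITON_MODEL_DIRECTORY are the Python backend's documented custom execution environment mechanism:

# build and pack an environment containing the preprocessing dependencies
conda create -y -n preprocess_env python=3.10
conda activate preprocess_env
pip install transformers conda-pack
conda-pack -o preprocess_env.tar.gz

# then point the model's config.pbtxt at the packed archive,
# placed alongside the model in the repository:
parameters: {
  key: "EXECUTION_ENV_PATH",
  value: { string_value: "$$TRITON_MODEL_DIRECTORY/preprocess_env.tar.gz" }
}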

Tabrizian added the question label Apr 19, 2024