What is the correct way of using batches? Help me understand how BentoML creates batches #1255
Hi, I am implementing a simple face detector service with BentoML and I would like it to run as fast as possible so I am trying to use the batch processing feature. I set:

@bentoml.api(input=JsonInput(), batch=True)
def predict(self, request):
    # my code here
    return results

Then I send the following request:

[
    { "client_id": "AAABBB", "image": "aaaAAA"},
    { "client_id": "CCCDDD", "image": "bbbBBB"}
]

While running the service, I realized BentoML is delivering the request to my function inside a list:

[
    [
        { "client_id": "AAABBB", "image": "aaaAAA"},
        { "client_id": "CCCDDD", "image": "bbbBBB"}
    ]
]

What confuses me is the structure of the parsed request. The way it is delivered to my function, I have only one item containing my entire request, which is different from what the documentation describes: it tells me to iterate over a list. In my case, I understand my input as a single request containing two different items that should be batch processed. Am I doing/understanding this concept correctly? Please educate me on this topic. Thank you in advance!

Hi @maikelronnau, Good question. When you have batch mode on, BentoML will attempt to group the incoming requests together in a small batch. For your case, if you are sending each item separately, BentoML will combine those requests together in a list.
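
To make this concrete, here is a minimal sketch in plain Python of what a batch-mode function would see in the two situations discussed in this thread. It does not import BentoML and is not BentoML's implementation. The names `detect_faces`, `parsed_json_list`, `one_request`, and `two_requests` are made up for illustration, and the sketch assumes that each element of the incoming list is the parsed body of one HTTP request and that one result is expected per element, in the same order.

```python
from typing import Any, Dict, List


def detect_faces(item: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for the real face detector."""
    return {"client_id": item["client_id"], "faces": []}


def predict(parsed_json_list: List[Any]) -> List[Any]:
    """Mimics the body of a batch-mode API function (illustrative only).

    Assumption: each element of parsed_json_list is the parsed body of ONE
    request, and the function returns one result per request.
    """
    results = []
    for body in parsed_json_list:
        if isinstance(body, list):
            # The client packed several items into a single request body.
            results.append([detect_faces(item) for item in body])
        else:
            # The client sent one item per request.
            results.append(detect_faces(body))
    return results


# Case from the question: ONE request whose body is a JSON array of two items.
one_request = [
    [
        {"client_id": "AAABBB", "image": "aaaAAA"},
        {"client_id": "CCCDDD", "image": "bbbBBB"},
    ]
]

# Case from the reply: TWO separate requests grouped into one batch.
two_requests = [
    {"client_id": "AAABBB", "image": "aaaAAA"},
    {"client_id": "CCCDDD", "image": "bbbBBB"},
]

print(len(predict(one_request)))   # number of results for the single request
print(len(predict(two_requests)))  # number of results for the two requests
```

Under these assumptions, the outer list in the question is the batch and its single element is the whole request body, which is why the function sees only one item. Sending each item as its own request would let the service group them, so the function would receive a flat list with one entry per request.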