I figured out that the batcher expects inputs in the form `{"instances": [...]}` and combines requests based on the `instances` list length and the `maxBatchSize` setting.
Let's say my custom transformer model expects inputs like `{"context": "hello", "questions": ["a", "b", "c", ...]}`.
So if 2 requests like below are made:

```python
request1 = {"instances": [{"context": "hello", "questions": ["a", "b", "c", "d"]}]}
request2 = {"instances": [{"context": "bye", "questions": ["e"]}]}
```

they are combined like so (assuming `maxBatchSize=4`):

```python
combined_request = {
    "instances": request1["instances"] + request2["instances"]
}
# which is equal to
combined_request = {
    "instances": [
        {"context": "hello", "questions": ["a", "b", "c", "d"]},
        {"context": "bye", "questions": ["e"]},
    ]
}
```
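To make the combining behavior concrete, here is a minimal runnable sketch of it (my own reconstruction for illustration, not the actual batcher code): requests are merged by concatenating their `instances` lists, and only the instance count is compared against `maxBatchSize`.

```python
# Sketch of the batching behavior described above (illustrative, not
# KServe's actual implementation): concatenate "instances" lists until
# adding another request would exceed max_batch_size instances.
def combine(requests, max_batch_size):
    batches, current = [], {"instances": []}
    for req in requests:
        if current["instances"] and (
            len(current["instances"]) + len(req["instances"]) > max_batch_size
        ):
            batches.append(current)
            current = {"instances": []}
        current["instances"] += req["instances"]
    if current["instances"]:
        batches.append(current)
    return batches

request1 = {"instances": [{"context": "hello", "questions": ["a", "b", "c", "d"]}]}
request2 = {"instances": [{"context": "bye", "questions": ["e"]}]}
# Only 2 instances total, so both requests end up in a single batch
# even though they carry 5 questions between them.
print(combine([request1, request2], max_batch_size=4))
```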
But in my case I want to calculate the batch size of my requests based on the length of the `instances.questions` key, which in the example above is 4 and 1, so they shouldn't be combined into a batch.
I know I can flatten the model input structure into a 1:1 `context:question` mapping, but due to other optimization concerns that isn't useful to me.
So my question is: Is it possible to batch based on another field?
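To illustrate what I mean, here is a sketch of the behavior I'm after: if the batcher accepted a pluggable size function, batch size could be computed from the `questions` field instead of the instance count. All names here (`combine_by_size`, `size_fn`) are made up for illustration; no such hook exists today.

```python
# Hypothetical batcher variant that measures batch size with a custom
# size function instead of len(instances). Illustration only.
def questions_size(instance):
    # Size of one instance = number of questions it carries.
    return len(instance["questions"])

def combine_by_size(requests, max_batch_size, size_fn):
    batches, current, current_size = [], {"instances": []}, 0
    for req in requests:
        req_size = sum(size_fn(i) for i in req["instances"])
        if current["instances"] and current_size + req_size > max_batch_size:
            batches.append(current)
            current, current_size = {"instances": []}, 0
        current["instances"] += req["instances"]
        current_size += req_size
    if current["instances"]:
        batches.append(current)
    return batches

request1 = {"instances": [{"context": "hello", "questions": ["a", "b", "c", "d"]}]}
request2 = {"instances": [{"context": "bye", "questions": ["e"]}]}
# request1 alone already reaches max_batch_size=4 questions, so
# request2 is placed in its own batch instead of being merged.
print(combine_by_size([request1, request2], 4, questions_size))
```

With this measure the two example requests stay in separate batches, which is exactly the behavior the instance-count-based batcher cannot express.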
I figure it isn't currently possible, because the batcher implementation seems pretty hardcoded to the expected input/output structure. I assume adding some flexibility to how the batcher behaves through settings could also allow the batcher to be compatible with v2 protocol requests (#2275) too.