Add test for sequence state after cancellation #7167
Conversation
```python
                sequence_id=1,
                sequence_start=True,
            )
            seq_start = False
```
question: why do we set this value to False when it is overwritten on line 84 and not used before that?
This variable should be used for `sequence_start`; as you already caught in another comment, it was always set to True. The issue is fixed.
```python
                model_name,
                self._get_inputs(delay_itrs=5000000),
                sequence_id=1,
                sequence_start=True,
```
Suggested change:

```diff
-                sequence_start=True,
+                sequence_start=seq_start,
```
Good catch!
```python
        with grpcclient.InferenceServerClient("localhost:8001") as client:
            client.start_stream(callback)
            seq_start = True
            num_reqs = 4
            seq_ids = [2, 3]
            for req_id in range(num_reqs):
                for seq_id in seq_ids:
                    client.async_stream_infer(
                        model_name,
                        self._get_inputs(delay_itrs=0),
                        sequence_id=seq_id,
                        sequence_start=seq_start,
                    )
                    time.sleep(0.1)
                seq_start = False
            client.stop_stream(cancel_requests=False)
```
Could we perhaps refactor this into a common function? It looks like a repetition of the block above it.
Good idea! Updated: Regroup infer calls
Looks good to me. Please rebase the branch.
Should be coupled with this core PR, which adds the fix: triton-inference-server/core#341
Reproduction steps:
Sequence state is identified by the sequence slot, so two new sequences executing concurrently will share the same sequence state, leading to corruption of that state.
Ref: #7117 (comment)
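The corruption mode described above can be illustrated with a toy model of slot-keyed state. This is purely illustrative and not Triton's actual state management (which lives in the core library): it just shows why keying state by slot rather than by sequence ID lets concurrent sequences trample each other.

```python
class SlotStateModel:
    """Toy model of per-slot accumulator state, keyed by slot, not sequence.

    If two live sequences map to the same slot, their updates interleave
    into a single accumulator and both observe corrupted running totals.
    """

    def __init__(self):
        self.state = {}  # slot -> running sum shared by whoever holds the slot

    def infer(self, slot, value):
        self.state[slot] = self.state.get(slot, 0) + value
        return self.state[slot]


m = SlotStateModel()
# Sequences A and B each send value 1, but both were assigned slot 0:
a1 = m.infer(0, 1)  # A expects a running total of 1
b1 = m.infer(0, 1)  # B also expects 1, but sees 2: A's state leaked in
assert (a1, b1) == (1, 2)
```

With distinct slots per live sequence (or state keyed by sequence ID), each sequence would see its own total, which is the behavior the new test guards after cancellation frees a slot.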