What does this pull request do? Explain your changes. (required)
Draft of ai-video selection algo fix.
Suspension was not working because the `penalty` was always 3. This logic was a carryover from transcoding, where the suspender always started at a refresh count of 0 because a new session manager was created with each stream. For AI, we reuse the session manager and the suspender, so the refresh count does not reset between requests. The fix is to take the current refresh count into account when calculating the penalty, so that it is 3 more than the current refresh count in the suspender.
There was also an issue where the `discoveryPoolSize` was always 100, and with a limited number of orchestrators providing models, sessions were being refreshed with every request. I added an `initialPoolSize` field that tracks the pool size at the last refresh, and the `shouldRefreshSessions` logic now uses it rather than 100. This stabilizes the suspender and allows more orchestrators to be tried with each `Select` call.
The last update moves the `signalRefresh()` call, which increments the refresh counter in the suspender, into the `Refresh` function. This makes it more stable: every time we refresh sessions, we add to the suspender refresh count.
The update to exclude managed containers depends on ai-worker #72.
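To illustrate the two fixes described above, here is a minimal sketch, not the code in this PR. The type layout, the `suspended` helper, the `until` map, and the "half of the last pool size" refresh threshold are all assumptions made for the example; only the names `penalty`, `signalRefresh`, `initialPoolSize`, and `shouldRefreshSessions` come from the description.

```go
package main

// suspender is a hypothetical stand-in for the real suspender. For each
// orchestrator it stores the refresh count at which the suspension ends.
type suspender struct {
	refreshCount int            // incremented once per session refresh
	until        map[string]int // orchestrator -> refresh count that lifts the suspension
}

func newSuspender() *suspender {
	return &suspender{until: make(map[string]int)}
}

// signalRefresh records one more session refresh.
func (s *suspender) signalRefresh() {
	s.refreshCount++
}

// suspend keeps orch suspended for the next `penalty` refreshes. The end of
// the suspension is measured from the current refresh count, so it still
// works when the suspender is reused and the count is already above 0.
func (s *suspender) suspend(orch string, penalty int) {
	s.until[orch] = s.refreshCount + penalty
}

// suspended reports whether orch is still suspended and drops expired entries.
func (s *suspender) suspended(orch string) bool {
	until, ok := s.until[orch]
	if !ok {
		return false
	}
	if s.refreshCount >= until {
		delete(s.until, orch)
		return false
	}
	return true
}

// shouldRefreshSessions compares the sessions still available against the
// pool size seen at the last refresh instead of a fixed 100.
// ASSUMPTION: refresh when fewer than half (rounded up) of that pool remain.
func shouldRefreshSessions(available, initialPoolSize int) bool {
	if available == 0 {
		return true
	}
	return available < (initialPoolSize+1)/2
}
```

In this sketch, a suspender that has already seen 10 refreshes and receives `suspend("orch-a", 3)` keeps `orch-a` suspended until the count reaches 13. If the stored value were a bare 3, it would already be below the current count and the suspension would never take effect. Likewise, with 2 orchestrators available, `shouldRefreshSessions(2, 100)` asks for a refresh on every call, while `shouldRefreshSessions(2, 2)` does not.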
Happy to split some of these changes into separate PRs. The suspension fixes can be added separately, without a dependency on the ai-worker PR.
Specific updates (required)
How did you test each of these updates (required)
I have been running these updates on my gateway. Tested 1-200 requests with 5-10 workers sending to the gateway. All requests completed with 1-2 orchestrators providing the Bytedance model.
Does this pull request close any open issues?
Checklist:
`make` runs successfully
`./test.sh` pass