Ai video fix selection pr #3033

Draft
wants to merge 4 commits into
base: ai-video


ad-astra-video
Contributor

@ad-astra-video ad-astra-video commented Apr 28, 2024

What does this pull request do? Explain your changes. (required)
Draft of ai-video selection algo fix.

Suspension was not working because the penalty was always 3. This logic was a carryover from transcoding, where the suspender always started at a refresh count of 0 because a new session manager was created for each stream. For AI, the session manager and the suspender are reused, so the refresh count does not reset between requests. The fix is to take the current refresh count into account when calculating the penalty, so that the penalty is 3 more than the suspender's current refresh count.
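
To illustrate why a fixed penalty stops working once the suspender is reused, here is a minimal sketch. It is a simplified model, not the actual go-livepeer code: the `suspender` type, its `until` map, and the helpers `fixedPenalty`, `relativePenalty`, and `remaining` are all hypothetical names, and the model assumes a suspension ends once the refresh count reaches the stored penalty.

```go
package main

import "fmt"

// basePenalty mirrors the "3" mentioned above; the name is hypothetical.
const basePenalty = 3

// suspender is a simplified stand-in, not the real go-livepeer type.
type suspender struct {
	until map[string]int // orchestrator -> refresh count at which suspension ends
	count int            // refreshes seen so far; never reset when reused
}

func newSuspender() *suspender { return &suspender{until: map[string]int{}} }

func (s *suspender) signalRefresh() { s.count++ }

func (s *suspender) suspend(orch string, penalty int) { s.until[orch] = penalty }

// remaining reports how many more refreshes orch stays suspended for.
func (s *suspender) remaining(orch string) int {
	if left := s.until[orch] - s.count; left > 0 {
		return left
	}
	delete(s.until, orch)
	return 0
}

// fixedPenalty is the carried-over behavior: correct only while count is 0.
func fixedPenalty() int { return basePenalty }

// relativePenalty is the fix described above: 3 more than the current count.
func relativePenalty(s *suspender) int { return s.count + basePenalty }

func main() {
	s := newSuspender()
	for i := 0; i < 10; i++ { // a reused suspender has already seen refreshes
		s.signalRefresh()
	}
	s.suspend("orch-a", fixedPenalty())
	s.suspend("orch-b", relativePenalty(s))
	fmt.Println(s.remaining("orch-a"), s.remaining("orch-b")) // prints "0 3"
}
```

In this model, an orchestrator suspended with the fixed penalty after 10 refreshes is immediately eligible again, while the relative penalty keeps it out for 3 more refreshes.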

There was also an issue where discoveryPoolSize was always 100, so with only a limited number of orchestrators providing models, sessions were refreshed on every request. I added an initialPoolSize field that tracks the pool size from the last refresh, and shouldRefreshSessions now uses it instead of 100. This stabilizes the suspender and allows more orchestrators to be tried with each Select call.
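
A rough sketch of the effect, under assumptions: the threshold rule below (refresh when available sessions fall under half the pool size, capped at a maximum) and the names `shouldRefresh` and `maxRefreshThreshold` are illustrative guesses, not the exact shouldRefreshSessions logic in this branch.

```go
package main

import (
	"fmt"
	"math"
)

// maxRefreshThreshold is a hypothetical cap on the refresh threshold.
const maxRefreshThreshold = 8

// shouldRefresh reports whether the number of available sessions has fallen
// below the threshold derived from poolSize (assumed rule: half the pool,
// rounded up, capped at maxRefreshThreshold).
func shouldRefresh(availableSessions, poolSize int) bool {
	threshold := int(math.Min(maxRefreshThreshold, math.Ceil(float64(poolSize)/2.0)))
	return availableSessions < threshold
}

func main() {
	// Only 2 orchestrators serve the model.
	// With a pool size fixed at 100 the threshold is 8, so 2 sessions always
	// look like too few and every request triggers a refresh.
	// With the pool size observed at the last refresh (2) the threshold is 1.
	fmt.Println(shouldRefresh(2, 100), shouldRefresh(2, 2)) // prints "true false"
}
```

Under this assumed rule, tracking the last observed pool size means a refresh is only triggered once the usable sessions actually run low relative to what discovery returned.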

The last update moves the suspender's signalRefresh() call, which increments the suspender's refresh counter, into the Refresh function. This is more stable because every session refresh now adds to the suspender's refresh count.
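
As a small sketch of that coupling (the `selector` type, its fields, and the `Select(needsRefresh bool)` signature are hypothetical simplifications, not the real AISessionSelector), placing the increment inside Refresh keeps the counter in step with refreshes no matter which code path triggers them:

```go
package main

import "fmt"

// suspender is a simplified stand-in that only tracks the refresh count.
type suspender struct{ count int }

func (s *suspender) signalRefresh() { s.count++ }

// selector is a hypothetical, pared-down session selector.
type selector struct {
	suspender *suspender
	refreshes int // stand-in for the work of rebuilding the session pools
}

// Refresh rebuilds sessions and always signals the suspender.
func (sel *selector) Refresh() {
	sel.suspender.signalRefresh()
	sel.refreshes++
}

// Select refreshes only when needed; it does not touch the counter itself.
func (sel *selector) Select(needsRefresh bool) {
	if needsRefresh {
		sel.Refresh()
	}
}

func main() {
	sel := &selector{suspender: &suspender{}}
	sel.Select(true)  // refresh triggered by Select
	sel.Select(false) // no refresh, counter unchanged
	sel.Refresh()     // refresh triggered directly
	fmt.Println(sel.suspender.count, sel.refreshes) // prints "2 2"
}
```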

The update to exclude managed containers depends on ai-worker #72.

Happy to split some of these changes into separate PRs. The suspension fixes can be added separately, with no dependency on the ai-worker PR.

Specific updates (required)

  • Updates suspension in the selector to use the suspender's current refresh count.
  • Moves the penalty to the AISessionSelector so that it is easier to update and is available when calculating the suspension needed.
  • Releases all orchestrators (Os) when there are none in the warm and cold pools.
  • Adds an option to not use managed containers.
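
One possible reading of the "release all" behavior in the list above, sketched under assumptions: here "release" is taken to mean clearing every suspension once no orchestrator in either pool is selectable. The `selector` type, its `suspended` map, and the `selectWithRelease` helper are hypothetical and simplified; the actual change may differ.

```go
package main

import "fmt"

// selector is a hypothetical model: pools are plain slices and suspended
// maps an orchestrator to its remaining suspension (0 means selectable).
type selector struct {
	warm, cold []string
	suspended  map[string]int
}

// selectOrch returns the first selectable orchestrator, warm pool first.
func (s *selector) selectOrch() (string, bool) {
	for _, pool := range [][]string{s.warm, s.cold} {
		for _, orch := range pool {
			if s.suspended[orch] == 0 {
				return orch, true
			}
		}
	}
	return "", false
}

// selectWithRelease clears all suspensions when neither pool has a
// selectable orchestrator, then tries once more.
func (s *selector) selectWithRelease() (string, bool) {
	if orch, ok := s.selectOrch(); ok {
		return orch, true
	}
	s.suspended = map[string]int{} // release every orchestrator
	return s.selectOrch()
}

func main() {
	s := &selector{
		warm:      []string{"orch-a"},
		cold:      []string{"orch-b"},
		suspended: map[string]int{"orch-a": 2, "orch-b": 1},
	}
	orch, ok := s.selectWithRelease()
	fmt.Println(orch, ok) // prints "orch-a true"
}
```

With only 1 or 2 orchestrators serving a model, a fallback like this avoids failing a request outright when every candidate happens to be suspended.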

How did you test each of these updates (required)

I have been running these updates on my gateway. Tested 1 to 200 requests with 5 to 10 workers sending to the gateway. All requests completed with 1 to 2 orchestrators providing the Bytedance model.

Does this pull request close any open issues?

Checklist:

@github-actions github-actions bot added the AI label (Issues and PR related to the AI-video branch.) Apr 28, 2024