[serve] deflake autoscaling tests #45358
Merged
Deflake `test_autoscaling_policy.py`, which contains e2e autoscaling tests.

- `test_handle_deleted_on_crashed_replica` would sometimes emit an error after the test had already passed: the restarted Router replica failed to initialize because the controller had already started shutting down all deployments. This PR adds a wait condition to make sure the Router replica has restarted before the test ends.
- `SignalActor`, with the same actor name, is used in a few of the tests. If Ray doesn't garbage collect the actor quickly enough, it is still alive when the next test starts, and consecutive test runs error out saying an actor of the same name already exists. This PR makes sure to kill the signal actor at the end of tests that use it.

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
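
For readers unfamiliar with the two patterns described above, here is a minimal, stdlib-only sketch of each. It is not the code from this PR and does not use Ray: `wait_until`, `NamedRegistry`, and `run_test_with_signal` are hypothetical names invented for illustration, and `NamedRegistry` is only a stand-in for a cluster-wide table of named actors.

```python
import time
from typing import Callable, Dict


def wait_until(
    predicate: Callable[[], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.05,
) -> None:
    """Poll `predicate` until it returns True, or raise TimeoutError.

    Hypothetical helper: a test would call this before returning so that
    background recovery (e.g. a replica restart) finishes before teardown.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_s}s")
        time.sleep(interval_s)


class NamedRegistry:
    """Hypothetical stand-in for a shared table of uniquely named actors."""

    def __init__(self) -> None:
        self._actors: Dict[str, object] = {}

    def create(self, name: str) -> object:
        # Creating a second actor under a live name is an error, which is
        # the failure mode the description attributes to slow cleanup.
        if name in self._actors:
            raise ValueError(f"actor named {name!r} already exists")
        actor = object()
        self._actors[name] = actor
        return actor

    def kill(self, name: str) -> None:
        # Explicit removal; killing a missing name is a no-op.
        self._actors.pop(name, None)

    def exists(self, name: str) -> bool:
        return name in self._actors


def run_test_with_signal(registry: NamedRegistry, name: str = "signal") -> None:
    """Create the named actor, run the test body, and always clean up."""
    registry.create(name)
    try:
        pass  # the test body would go here
    finally:
        # Do not rely on garbage collection to free the name.
        registry.kill(name)
```

Putting the cleanup in `finally` means the name is released even when the test body raises, so a failing test does not cascade into name collisions in later tests.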