Make sure not to close the evaluator _ws_thread when done is not set yet #7901
Conversation
```python
while not self._done.done():
    time.sleep(0.1)
```
As this is repeated three times, should we put it in a separate method called `wait_until_done` or `sleep_until_done`?
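For illustration, a minimal sketch of such a helper (the class scaffolding and the `poll_interval` parameter are assumptions for the example, not code from this PR):

```python
import time
from concurrent.futures import Future


class Evaluator:
    def __init__(self) -> None:
        self._done: Future = Future()

    def wait_until_done(self, poll_interval: float = 0.1) -> None:
        # Same busy-wait as above, factored out so the three call
        # sites share a single implementation.
        while not self._done.done():
            time.sleep(poll_interval)
```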
Yes, we should.
I have replaced the `time.sleep` with a `threading.Event()`.
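A minimal sketch of the `threading.Event` pattern (the method names and class scaffolding are assumptions; the actual PR code may differ):

```python
import threading
from typing import Optional


class Evaluator:
    def __init__(self) -> None:
        self._done = threading.Event()

    def _signal_done(self) -> None:
        # Called from the websocket thread once evaluation has finished.
        self._done.set()

    def wait_until_done(self, timeout: Optional[float] = None) -> bool:
        # Blocks on the event instead of polling; wakes up as soon as
        # _signal_done() runs and returns True once the flag is set.
        return self._done.wait(timeout=timeout)
```

Compared with the polling loop, the waiting thread wakes immediately when the event is set rather than up to 0.1 s later, and burns no CPU in the meantime.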
With this PR, the test passed. Looks good! 👍🏻
- Use threading.Event to signal done to the main thread
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main    #7901      +/-   ##
==========================================
- Coverage   85.38%   85.32%   -0.06%
==========================================
  Files         381      381
  Lines       23593    23613      +20
  Branches      881      891      +10
==========================================
+ Hits        20145    20148       +3
- Misses       3336     3358      +22
+ Partials      112      107       -5
```

☔ View full report in Codecov by Sentry.
I played a bit with the repeat numbers and the failing behavior becomes consistent at about ~20 repeats. Each run of the test takes about 4 seconds, so it seems reasonable to add a repeat marker to the test so that we get consistent test results rather than flaky behavior. We run with pytest-xdist anyway, so this should take negligible additional time (at least in comparison to the …).
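For reference, the repeat idea would look roughly like this, assuming the `pytest-repeat` plugin (an assumption for illustration; the parametrized version that was preferred instead follows below):

```python
import pytest


# With pytest-repeat installed, this marker re-runs the test 20 times,
# roughly the point at which the failure was observed to become consistent.
@pytest.mark.repeat(20)
def test_run_and_cancel_legacy_ensemble():
    ...
```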
We talked about the repeat idea, and came up with the idea of just parametrizing the test:

```python
@pytest.mark.timeout(60)
@pytest.mark.usefixtures("using_scheduler")
@pytest.mark.parametrize("num_reals, num_jobs", itertools.product(range(2, 6), range(2, 6)))
def test_run_and_cancel_legacy_ensemble(
    tmpdir, make_ensemble_builder, monkeypatch, num_reals, num_jobs
):
    print(num_reals, num_jobs)
    custom_port_range = range(1024, 65535)
    with tmpdir.as_cwd():
        ensemble = make_ensemble_builder(
            monkeypatch, tmpdir, num_reals, num_jobs, job_sleep=40, max_runtime=1
        ).build()
        config = EvaluatorServerConfig(
            custom_port_range=custom_port_range,
            custom_host="127.0.0.1",
            use_token=False,
            generate_cert=False,
        )
        evaluator = EnsembleEvaluator(ensemble, config, 0)
        evaluator.start_running()

        async def _run_monitor():
            async with Monitor(config) as mon:
                cancel = True
                with contextlib.suppress(
                    ConnectionClosed
                ):  # monitor throws some variant of CC if dispatcher dies
                    async for _ in mon.track():
                        # Cancel the ensemble upon the arrival of the first event
                        if cancel:
                            await mon.signal_cancel()
                            cancel = False

        run_monitor_in_loop(_run_monitor)
        assert evaluator._ensemble.status == state.ENSEMBLE_STATE_CANCELLED
        # realisations should not finish, thus not creating a status-file
        for i in range(num_reals):
            assert not os.path.isfile(f"real_{i}/status.txt")
```
Closing this PR, as it seems it is no longer required. Relates to #8000.
Issue
(Maybe) Resolves #7642
Approach
Make sure to wait until done is set.