Make sure not to close the evaluator _ws_thread when done is not set yet #7901

Closed · 3 commits

Conversation

@xjules (Contributor) commented May 14, 2024

Issue
(Maybe) Resolves #7642

Approach
Make sure to wait until done is set before closing the evaluator's _ws_thread.
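
For context, a minimal sketch of the idea (illustrative only: _done, a future-like object, and _ws_thread are the names from this PR; the method name and surrounding structure are assumptions):

import time

def _wait_for_done_and_join(self):
    # Do not tear down the websocket thread while the evaluator is still
    # working; poll the done future and only join once it has resolved.
    while not self._done.done():
        time.sleep(0.1)
    self._ws_thread.join()

This mirrors the diff hunk discussed in the review comment below.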

  • PR title captures the intent of the changes, and is fitting for release notes.
  • Added appropriate release note label
  • Commit history is consistent and clean, in line with the contribution guidelines.
  • Make sure tests pass locally (after every commit!)

When applicable

  • When there are user facing changes: Updated documentation
  • New behavior or changes to existing untested code: Ensured that unit tests are added (See Ground Rules).
  • Large PR: Prepare changes in small commits for more convenient review
  • Bug fix: Add regression test for the bug
  • Bug fix: Create Backport PR to latest release

Comment on lines 148 to 149:

while not self._done.done():
    time.sleep(0.1)
Reviewer (Contributor):

As this is repeated three times, should we put it in a separate method called wait_until_done or sleep_until_done?

@xjules (author):

Yes, we should.

@xjules (author):

I have replaced the time.sleep polling with a threading.Event().
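
A minimal sketch of what that looks like (the class and method names here are illustrative, not the actual evaluator code; only the use of threading.Event comes from the comment above):

import threading
from typing import Optional

class _DoneSignal:
    def __init__(self) -> None:
        # Set once the evaluator has finished; replaces the sleep/poll loop.
        self._done_event = threading.Event()

    def signal_done(self) -> None:
        # Called by the evaluator when the ensemble evaluation is finished.
        self._done_event.set()

    def wait_until_done(self, timeout: Optional[float] = None) -> bool:
        # Blocks without busy-waiting; returns True once done was signalled.
        return self._done_event.wait(timeout=timeout)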

@berland (Contributor) commented May 15, 2024

With this PR, the test

fix_loop$ pytest tests/unit_tests/ensemble_evaluator/test_ensemble_legacy.py -k run_and_cancel_legacy_ensemble[using_sch --count=10000 -n 48

passed. Looks good! 👍🏻

New commit: Use threading.Event to signal done to the main thread
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.32%. Comparing base (faa45eb) to head (d52fb7e).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7901      +/-   ##
==========================================
- Coverage   85.38%   85.32%   -0.06%     
==========================================
  Files         381      381              
  Lines       23593    23613      +20     
  Branches      881      891      +10     
==========================================
+ Hits        20145    20148       +3     
- Misses       3336     3358      +22     
+ Partials      112      107       -5     


@eivindjahren (Contributor) commented May 16, 2024

> With this PR, the test
>
> fix_loop$ pytest tests/unit_tests/ensemble_evaluator/test_ensemble_legacy.py -k run_and_cancel_legacy_ensemble[using_sch --count=10000 -n 48
>
> passed. Looks good! 👍🏻

I played a bit with the repeat numbers and the failing behavior becomes consistent at around 20 repeats. Each run takes about 4 seconds, so it seems reasonable to add

@pytest.mark.repeat(20)

to the test so that we get consistent test results rather than flaky behavior. We run with pytest-xdist anyway, so this should take negligible additional time (at least compared to the using_scheduler fixture). This also means adding pytest-repeat to the dev dependencies in pyproject.toml.
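
For reference, a rough sketch of how that could look (pytest-repeat exposes the marker as pytest.mark.repeat; the test body is elided here):

import pytest

@pytest.mark.timeout(60)
@pytest.mark.repeat(20)  # provided by the pytest-repeat plugin
@pytest.mark.usefixtures("using_scheduler")
def test_run_and_cancel_legacy_ensemble(tmpdir, make_ensemble_builder, monkeypatch):
    ...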

@eivindjahren (Contributor) commented May 24, 2024

We talked about the repeat idea and landed on simply parametrizing the test instead:

@pytest.mark.timeout(60)
@pytest.mark.usefixtures("using_scheduler")
@pytest.mark.parametrize("num_reals, num_jobs", itertools.product(range(2, 6), range(2, 6)))
def test_run_and_cancel_legacy_ensemble(
    tmpdir, make_ensemble_builder, monkeypatch, num_reals, num_jobs
):
    print(num_reals, num_jobs)
    custom_port_range = range(1024, 65535)
    with tmpdir.as_cwd():
        ensemble = make_ensemble_builder(
            monkeypatch, tmpdir, num_reals, num_jobs, job_sleep=40, max_runtime=1
        ).build()
        config = EvaluatorServerConfig(
            custom_port_range=custom_port_range,
            custom_host="127.0.0.1",
            use_token=False,
            generate_cert=False,
        )

        evaluator = EnsembleEvaluator(ensemble, config, 0)

        evaluator.start_running()

        async def _run_monitor():
            async with Monitor(config) as mon:
                cancel = True
                with contextlib.suppress(
                    ConnectionClosed
                ):  # monitor throws some variant of CC if dispatcher dies
                    async for _ in mon.track():
                        # Cancel the ensemble upon the arrival of the first event
                        if cancel:
                            await mon.signal_cancel()
                            cancel = False

        run_monitor_in_loop(_run_monitor)
        assert evaluator._ensemble.status == state.ENSEMBLE_STATE_CANCELLED

        # realisations should not finish, thus not creating a status-file
        for i in range(num_reals):
            assert not os.path.isfile(f"real_{i}/status.txt")

@xjules (Contributor Author) commented May 29, 2024

Closing this PR as it no longer seems to be required. Relates to #8000.

@xjules closed this May 29, 2024
Projects: Done

Linked issue (may be closed by this PR): No running event loop error observed when doing heavy update

5 participants