
[WIP] parallelise tests using unittest2 #332

Open

wants to merge 1 commit into base: master

Conversation

@stefantalpalaru (Contributor) commented May 30, 2019

the good

`time build/all_tests`:

| before | after |
|--------|-------|
| 5.39s  | 2.46s |

Significant gains, right? Too bad that most of the time is taken by compilation, not by actually running the tests.

`rm -rf nimcache; time make test`:

| before | after  |
|--------|--------|
| 2m 12s | 1m 49s |

Why the 23s difference when only the test suites imported in "all_tests.nim" were parallelised, and the runtime difference there is a little under 3s? My guess is that some changes in "test_helpers.nim" made the code faster to compile.

the bad

There's an occasional deadlock in unittest2's wrapper for threadpool's sync(). I hope it will be solved by the two commits I made to Nim's devel branch, but I need them to reach a public release before I can refactor unittest2 to depend on those fixes.

Until that's sorted out, this PR is in limbo.
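For context, here is a minimal sketch of the pattern the unittest2 wrapper builds on, assuming the standard library threadpool module (the names below are illustrative, not unittest2's actual API); sync() is the call that blocks until every spawned task has finished, and is where the occasional hang shows up:

```nim
# Sketch of the spawn/sync pattern (compile with --threads:on).
# `runTestBody` is an illustrative stand-in for a single test body.
import std/threadpool

proc runTestBody(id: int) =
  echo "running test ", id

for i in 1 .. 4:
  spawn runTestBody(i)

sync()  # blocks until all spawned tasks are done; the reported deadlock
        # happens inside unittest2's wrapper around this call
```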

the ugly

Adding one extra proc to the general state test runner was enough to bring back VM-related call-stack overflow issues. It's unfortunate that we're regressing on 17 JSON test files, but it's good to know that the workarounds in place were not something we could rely on in the long term.

I wasn't able to parallelise "test_rpc.nim" and "test_rpc_whisper.nim" due to a strange "'yield' only allowed in an iterator" error whenever await appeared inside a test. Maybe it's something simple, like an error in the compiler's checks. Maybe it's something complex emerging from mixing async and threads.
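A hedged guess at the cause, for anyone hitting the same thing: await expands to a yield inside the closure iterator generated by the async macro, so it is only legal inside an async proc, and once a test body ends up in its own plain proc there is no async context around it. A minimal sketch of the usual workaround, assuming chronos as the async backend and purely illustrative names:

```nim
import unittest2
import chronos

# Illustrative async helper, not taken from the real test suite.
proc fetchBlockNumber(): Future[int] {.async.} =
  return 42

suite "rpc sketch":
  test "block on the future instead of awaiting it":
    # Writing `await fetchBlockNumber()` directly here is what triggers
    # the "'yield' only allowed in an iterator" error; waitFor blocks
    # the current thread until the future completes instead.
    check waitFor(fetchBlockNumber()) == 42
```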

the weird

After changing the jsonTest() logic, more test results appear in the corresponding *.md log files, and it's not just the newly added explicit skips. Hundreds of new, successfully run JSON test fixtures appear in those logs. I think a logic bug was preventing this logging, and it was inadvertently fixed by the changes meant to ensure GC-safety inside tests.

@jangko (Contributor) commented May 31, 2019

  • The new entries in PersistBlockTests.md and TracerTests.md are definitely incorrect; the old entries were already correct.
  • PrecompileTests.md also contains entries suspiciously similar to the two above, so I assume it is incorrect as well.

I believe the bug is in tests/test_helpers.nim, perhaps a subtle thread-safety problem hidden somewhere.

@stefantalpalaru (Contributor, Author)

> I believe the bug is in tests/test_helpers.nim

It was: I made the status table a global var, then forgot to clear it between test suites.
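For reference, a minimal sketch of the shape of that fix, with hypothetical names (the real table lives in tests/test_helpers.nim and its exact types may differ):

```nim
import std/tables

# Hypothetical global status table shared by the jsonTest() machinery:
# suite name -> (fixture name -> passed?).
var status = initOrderedTable[string, OrderedTable[string, bool]]()

proc writeMdReport(suiteName: string) =
  # ... dump `status` into the suite's *.md log file ...
  discard

proc finishSuite(suiteName: string) =
  writeMdReport(suiteName)
  # Without this, results from one suite leak into the next suite's report.
  status.clear()
```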

- threadpool.setMinPoolSize(2) to avoid AppVeyor OOM
- jsonTest: clear the global "status" table after we're done with it
- try to reduce maximum memory usage
- tests: switch to debug builds
- 32-bit Windows: set the IMAGE_FILE_LARGE_ADDRESS_AWARE flag
  so we can use PAE, if enabled, and access more than 2 GiB of RAM
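A hedged sketch of how the last item in that list could be wired up, assuming the MinGW/gcc toolchain used for the 32-bit Windows builds (the actual commit may set the flag differently):

```
# nim.cfg sketch (hypothetical): ask GNU ld to set the
# IMAGE_FILE_LARGE_ADDRESS_AWARE bit in the PE header, so a 32-bit binary
# can use more than 2 GiB of address space when the OS allows it.
@if windows and i386:
  --passL:"-Wl,--large-address-aware"
@end
```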
stefantalpalaru added a commit that referenced this pull request Oct 31, 2019
This is #332 with parallelism disabled, while some threadpool deadlock is being debugged.
@jlokier (Contributor) commented May 26, 2021

Times have changed - now the test suite is huge and takes longer to run than to compile. The original patch won't apply any more, but the idea is welcome. Parallel tests would be great!

(Mentioning it here because the PR is still open. Might as well use it or close it.)

If there is interest in bringing this change, or another like it, up to date, the comments below, which discuss other things we'd like, may be relevant (from #683 (comment)):

> .. capture log output for each test individually and save it to a file (or xml/json report) so it can be viewed per test - especially per failed test. This requires fixing the test runner, rather than changing the production code. Finally, one could consider not printing anything for tests that succeeded - that typically requires a more sophisticated test runner that can show a progress bar or something like that - it would simply print fails.

If there ever is a more sophisticated test runner that captures output to hide it on success, it would be good to run tests in parallel too, and use the capture functionality to serialise their outputs, both to the log and to the terminal. I'm pretty sure there's a tool for this, but I don't recall its name.

@stefantalpalaru (Contributor, Author)

Unfortunately, threadpool still deadlocks occasionally.

@arnetheduck (Member)

the (easy) way to run tests in parallel is to run them as separate processes - it doesn't require anything of the testing framework and makes sure things are nicely isolated. This is especially important because we don't use threading a lot, and some of the libraries use globals and other thread-incompatible constructs that don't get tested much.

That said, transitioning to unittest2 should be done regardless - its biggest advantage is that it puts each test in a separate proc which gives it a cleaner environment and stack, and doesn't pollute the global table in nim which is limited. It's also in ut2 that we can add features such as capturing output easily.
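For anyone who hasn't used it, a minimal sketch of unittest2 usage; the surface API mirrors std/unittest, but each test body below ends up in its own proc with its own stack instead of being inlined into the enclosing scope:

```nim
import unittest2

suite "example suite":
  setup:
    # runs before every test in this suite
    discard

  test "addition":
    check 1 + 1 == 2

  test "string length":
    check "nimbus".len == 6
```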

Finally, @jangko was looking into mocha - writing a mocha collector that outputs the json (without using json-serialization) would also be done in unittest2.

@jlokier (Contributor) commented May 27, 2021

> the (easy) way to run tests in parallel is to run them as separate processes

I agree. There are some performance downsides, but the isolation is much better, and things that might use globals, threadvars, timers, etc. need that isolation for the tests to mean anything. (And some tests would want to launch their own threads - e.g. threadpool tests.)

Processes also solve the stdout/stderr-are-not-per-thread problem.

On unix, fork is reasonably cheap, but not ultra-cheap. fork+exec is how you normally spawn a program, but that's less cheap, partly because of the exec and partly because of program initialisation. Keep in mind we currently have roughly 12,500 tests and would like to run another ~28,000, so every extra millisecond of per-test overhead adds about 40 seconds to the total testing time.

Windows doesn't have fork. Cygwin simulates it in a complicated way, but not only are we not using Cygwin, Cygwin fork is slow anyway. The only practical choice on Windows is equivalent to unix fork+exec.

Between those two, I think there's merit in using a "processpool" to allow related tests to run in the same process sequentially, for tests we're reasonably confident are fine with that.
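As a rough illustration of the separate-process approach, here is a sketch using std/osproc, assuming each test suite is already compiled to its own binary (the paths are illustrative); execProcesses runs the commands with a bounded number of concurrent child processes.

```nim
import std/[osproc, cpuinfo]

# Illustrative list of pre-built test binaries; not the real suite layout.
let testBinaries = @[
  "build/test_rpc",
  "build/test_graphql",
  "build/test_generalstate_json",
]

# Run at most countProcessors() children at a time; execProcesses returns
# the highest exit code seen, so 0 means every binary passed.
let worstExitCode = execProcesses(testBinaries,
                                  n = countProcessors(),
                                  options = {poParentStreams})
doAssert worstExitCode == 0, "at least one test binary failed"
```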

@stefantalpalaru (Contributor, Author)

> transitioning to unittest2

Most of the Nimbus-eth1 tests already use unittest2. Only these are still on unittest:

tests/test_rpc.nim
tests/rpcclient/test_hexstrings.nim
tests/test_rpc_whisper.nim
tests/test_graphql.nim

@jlokier (Contributor) commented May 27, 2021

If it's using unittest2, and the headline feature is parallelism, why don't I see any? It's 1 core all the way.

@arnetheduck (Member)

> the headline feature is parallelism

it's not, any more - it has to be enabled, and it's not working reliably. There are other significant features in there, however, in particular proc isolation for each test (fewer globals).
