Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loadscope incorrectly modifying test ordering, when ordering is relevant #1083

Open
Toad2186 opened this issue May 14, 2024 · 9 comments
Open

Comments

@Toad2186
Copy link

Toad2186 commented May 14, 2024

I'm not sure if there are other use cases, but the relevant change I'm talking about is a byproduct of #778.

With #778, we are now ordering tests based on the number of tests available. The problem is that this changes the input test order, and in some cases, such as when running with pytest-django, the test ordering is relevant. In particular, Django (and also pytest-django) orders its tests by grouping them:

(django) TestCase
TransactionalTestCase, SimpleTestCase
All other tests

Tests within each group doesn't need to be ordered, but the group ordering is relevant because TransactionalTestCase may have undesirable side-effects.

For simplicity, assume we run with one worker with --dist loadscope.

So if I have tests like this, for example:

class TestTransactionalTestCase1(TransactionalTestCase):
    def test_tttc_1...

    def test_tttc_2...

class TestCase1(TestCase):
    def test_tc_3...

When running with 1 worker, we would get this ordering:

test_tttc_1
test_tttc_2
test_tc_3

Whereas when we run without pytest-xdist, we would run in this order (and this is the "correct" behavior based on Django's ordering):

test_tc_3
test_tttc_1
test_tttc_2

More broadly, I think re-ordering the input test ordering may conflict with other scenarios where ordering does matter. It's not a Django-specific issue, although I don't know of any other examples where this is relevant.

@nicoddemus
Copy link
Member

I guess a solution would be to include a new ini-option that disables this ordering for loadscope?

cc @cryvate @joekohlsdorf

@cryvate
Copy link
Contributor

cryvate commented May 14, 2024

@nicoddemus I think that makes sense.

Actually thinking about it, I had misunderstood the change I made as I was not too aware of the internals of pytest-xdist.

I think it makes sense to do the following 2 things:

  • make the behaviour opt in (or opt out) using a new ini option and command line option
  • make the behaviour also supported in loadfile, load (I did not realize there was such a split).

This can be achieved in two PRs I reckon.

@joekohlsdorf
Copy link

The change which was done to loadscope in #778 does not reorder individual tests within a scope, it orders entire scopes by the number of tests within them.
With xdist you can't expect deterministic ordering of scopes because it will distribute scopes arbitrarily among workers.

On a sidenote, I disagree that TransactionalTestCase has side effects, the database is still rolled back after each test. Other types of side effects are usually issues with your tests and I don't recommend trying to avoid flakiness by enforcing a test order.

@nicoddemus
Copy link
Member

With xdist you can't expect deterministic ordering of scopes because it will distribute scopes arbitrarily among workers.

Good point, this invariant was not changed by #778, being there previously already.

@Toad2186
Copy link
Author

Toad2186 commented May 14, 2024

the database is still rolled back after each test.

The database is not rolled back after each test. The database is truncated after each test, which is a little different. One retains the state of the DB from the beginning of the test (with some exceptions, such as sequences which retain their incremented counts in most DB types); the other wipes the DB entirely.

It's not impossible to make TransactionTestCase work irrespective of the ordering of TestCase vs. TransactionTestCase, but it might be much less efficient. For example, if you preload a large amount of data into the DB, running a transaction test case would wipe the data, then, test case would fail. You can argue it's more correct that each test should preload its own data, and you wouldn't be entirely wrong and we can do that, but loading and deleting a bunch of data before and after each test can end up slowing down things, a lot, depending on the operations involved.

With xdist you can't expect deterministic ordering of scopes because it will distribute scopes arbitrarily among workers.

Good point, this invariant was not changed by #778, being there previously already.

The issue isn't deterministic ordering -- the issue is relative ordering of groups of tests. Again, it's not entirely wrong per-se to require that tests run irrespective of ordering, but this grouping is something that's been in Django since 2.0 and you can look at that commit for the same reasoning I mentioned above.

The ordering between groups, while not explicitly supported, was the case prior to #778 and was changed by #778. Which also means that, since then, unfortunately it's not 100% compatible with Django tests in all cases for test suites that take advantage of Django's grouping guarantee. And I get that not everything revolves around Django, so perhaps it's not a problem that you'd even care to solve, but IMO it makes sense to try to play as nice as possible with other libraries.

@nicoddemus
Copy link
Member

The ordering between groups

IIUC, previously due to the implementation it would always guarantee the order of the groups, so tests for group A and group B would for sure end up either in the same work (in that order), or in separate workers depending on number of workers/tests. But #778 indeed changed that so group B and group A might execute in that order on the same worker, in case group B has more tests (please correct me if I'm wrong, I'm drawing from memory as I'm short on time to look up at the code/issues with more detail).

@Toad2186
Copy link
Author

IIUC, previously due to the implementation it would always guarantee the order of the groups, so tests for group A and group B would for sure end up either in the same work (in that order), or in separate workers depending on number of workers/tests. But #778 indeed changed that so group B and group A might execute in that order on the same worker, in case group B has more tests (please correct me if I'm wrong, I'm drawing from memory as I'm short on time to look up at the code/issues with more detail).

Exactly. pytest-xdist does not guarantee group ordering either, so officially it's still correct, but different from what used to happen.

@nicoddemus
Copy link
Member

In that case, I think it is reasonable to add an option to make the behavior opt-out in case group order matters.

Why not the other way around? My reasoning is that we should always advise for tests not to be order-dependent, so guaranteeing group order by default goes against that. In addition, the performance benefits of ordering the groups by number of tests is significant so I think should be the default.

What do you folks think?

I would love to review a PR adding that option and accompanying tests. 👍

@Toad2186
Copy link
Author

I can take a stab at it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants