Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

“Learn” to avoid trying routes that do not affect the dependency graph #64

Open
uranusjr opened this issue Dec 8, 2020 · 49 comments
Labels
enhancement New feature or request

Comments

@uranusjr
Copy link
Member

uranusjr commented Dec 8, 2020

pip install "apache-airflow[devel_ci]==1.10.14rc1" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1-10/constraints-3.6.txt"

This (run against pypa/pip@fd8ddb6) would result in the resolver backtracking on setuptools, but these backtracks are obviously useless since setuptools does not have dependencies.

The resolver should be able to “look ahead” and see that trying other versions of setuptools would not change the conflicting graph, and skip working on them altogether. This applies not only to packages without dependencies (although they are the easiest target), but also versions that have the same dependencies. The idea of “the same dependencies” however require provider support (a new hook are_dependencies_equal or something) so those are more difficult to implement.

@uranusjr uranusjr added the enhancement New feature or request label Dec 8, 2020
@pfmoore
Copy link
Collaborator

pfmoore commented Dec 8, 2020

I'm not sure I understand. You say "setuptools does not have dependencies", but surely we can only say that "this specific version of setuptools does not have dependencies"? If we backtrack and try a different version of setuptools, we might get one that does have dependencies.

Am I missing something? (It's early and I haven't had a cup of tea yet, so it's quite likely 🙂)

@uranusjr
Copy link
Member Author

uranusjr commented Dec 8, 2020

Yes, sorry for the lax language, I was in a hurry writing the message. What I have in mind is the resolver should be able to look at (say) setuptools 40, and realise the rest of the dependency graph after pinning it would be exactly the same as setuptools 40.1. So if 40.1 failed, 40.0 is also destined to fail, and should be skipped entirely. But if setuptools x does have dependencies (or more generally, have a different set of dependencies from other versions we’ve tried), then we should try it.

@pfmoore
Copy link
Collaborator

pfmoore commented Dec 8, 2020

Ah, OK. So if we fail on 40.1, then the resolver can ask the provider "does 40.0 have the same dependencies?" and ignore 40.0 if the answer is "no".

I can see how that would help, but would it not be almost as good (and not require an additional mechanism) if the provider cached dependency data in memory, so that checking 40.0 would be fast enough that skipping it is not worth it? That would mean breaking pathological cases like projects that calculate dependencies randomly, but IMO it's fine for a provider to make that choice (and I'd be fine with pip doing so, as we already cache wheels which in effect does that).

Of course, it's not an either/or choice - we can do both.

@pradyunsg
Copy link
Contributor

pradyunsg commented Dec 8, 2020

IIUC, This is essentially the CDCL style behaviour - I'm not a 100% sure I understand how we'd phrase that in terms of criterion-based model we have here, but I'm a strong +1 on this.

@uranusjr
Copy link
Member Author

uranusjr commented Dec 8, 2020

The main things this is trying to avoid is not setuptools itself, but the states derived after setuptools 40.04 is pinned. If the resolution process is thought as a maze, setuptools 41.0 and 40.0 are two doors that lead to the same room. With naive backtracking, we’ll need to check all the doors that room has every time we enter it, but if we can somehow identify that the two doors (versions) bring us to the same room (state of criteria), we can avoid exploring the room the second time altogether. This can be done after we enter the room (i.e. pinning setuptools 40.0), but since each resolution round in resolvelib is a process of pinning + adding dependencies, doing it after entering the room (pinning) means we somehow need to abort a round half way; it is probably easier to “peek” before we walk through the setuptools 40.0 door (I think). The main challenge here would be to implement an efficient way to identify whether we indeed are entering the same room (i.e. implement equality on the criteria-held requirements in a state).

And yes, this is basically Conflict Driven Clause Learning (CDCL). We can express each resolvelib as an unsatisfied boolean formula,[citation needed] and everything beside setuptools!=40.0 is already known to evaluate to false, so we can perform unit propagation and derive setuptools!=40.0 must be true. But this can be explained in a human-friendly way, so I would prefer to not bring in martian language 🙂

@pfmoore
Copy link
Collaborator

pfmoore commented Dec 8, 2020

Cool. I checked the Wikipedia entry on CDCL, and I definitely prefer your explanation 🙂

@pradyunsg
Copy link
Contributor

pradyunsg commented Dec 8, 2020

Whoops! I should've used more words, and communicated more clearly. Here's my attempt at describing CDCL in simpler terms:

remembering about "things that conflict", so that it knows when it's gonna see the same conflict later, because it changed a component that doesn't affect the conflict.

"it" is the resolver.

As @uranusjr put it, the trick is figuring out "things that conflict" and efficiently detecting when that re-occurs.

@pradyunsg
Copy link
Contributor

Do you reckon #84 handles this?

@notatallshaw
Copy link
Contributor

notatallshaw commented Sep 26, 2021

Do you reckon #84 handles this?

I'm reading through Wikipedia's example of CDCL and it has a step analogous to what I wanted to implemented as part of #84: "Step 14 - Non-chronological back jump to appropriate decision level"

I believe this could make an even bigger difference on reducing the amount of backtracking, but the complexity looked to be too much to implement as a first PR. And I found that implementing just the part of preferring failure causes was good enough for almost all real world examples I could find.

@notatallshaw
Copy link
Contributor

I've been thinking about this further today and I would say the optimization I've implemented as part of #84 is only effective for what I would call "shallow conflicts", i.e. conflicts where by focusing on the requirements of the known failures at the current position in the dependency tree you are able to resolve those conflicts and find a solution for all the requirements without significant backtracking up the dependency tree.

However I would call a "deep conflict" where you have found failures but to find a solution it requires significantly backtracking up the dependency tree because those failures were pinned many rounds ago. An example of this is "airflow[all]==1.10.13".

My gut feeling is there are strong techniques that can tackle these problems. But let's see if they are reported first?

@notatallshaw
Copy link
Contributor

notatallshaw commented Oct 2, 2021

I've been mulling on this idea and I think I have devised a simple(ish) approach to implementing an analogous step to the "Non-chronological back jump" in CDCL and how that might be used to speed up resolution. Once #84 and pypa/pip#10481 land I'm going to start experimenting and see if I can actually implement it and if there are really use cases it can improve.

By definition it will actually change the behavior of the resolver itself, not just give the downstream library the opportunity to choose a different path, so if I can implement it and show it helps in some cases it's still probably going to take even more agreement that it's a good approach.

@notatallshaw
Copy link
Contributor

notatallshaw commented Oct 13, 2021

I've started work on this "Non-chronological back jump" and even though I've only implemented a naive solution I've already had very good success with it, I have 2 primary tests (both happen to be Python 3.7 only):

Test 1 (runs dozens of times faster with back jumps):

poetry2setup
wheel
twine

Test 2 (actually able to achieve ResolutionImpossible after ~5-10 mins [using cache] with back jumps):

apache-airflow[all]==1.10.13
For those curious here is the output of the apache airflow resolution impossible (click to expand)

ERROR: Cannot install apache-airflow and boto3 because these package versions have conflicting dependencies.

The conflict is caused by:
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.26 depends on botocore<1.17.0 and >=1.16.26
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.25 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.24 depends on botocore<1.17.0 and >=1.16.24
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.23 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.22 depends on botocore<1.17.0 and >=1.16.22
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.21 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.20 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.19 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.18 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.17 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.16 depends on botocore<1.17.0 and >=1.16.16
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.15 depends on botocore<1.17.0 and >=1.16.15
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.14 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.13 depends on botocore<1.17.0 and >=1.16.13
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.12 depends on botocore<1.17.0 and >=1.16.12
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.11 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.10 depends on botocore<1.17.0 and >=1.16.10
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.9 depends on botocore<1.17.0 and >=1.16.9
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.8 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.7 depends on botocore<1.17.0 and >=1.16.7
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.6 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.5 depends on botocore<1.17.0 and >=1.16.5
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.4 depends on botocore<1.17.0 and >=1.16.4
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.13.3 depends on botocore<1.17.0 and >=1.16.3
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.2 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.1 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.13.0 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.49 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.48 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.47 depends on botocore<1.16.0 and >=1.15.47
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.46 depends on botocore<1.16.0 and >=1.15.46
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.45 depends on botocore<1.16.0 and >=1.15.45
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.44 depends on botocore<1.16.0 and >=1.15.44
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.43 depends on botocore<1.16.0 and >=1.15.43
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.42 depends on botocore<1.16.0 and >=1.15.42
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.41 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.40 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.39 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.38 depends on botocore<1.16.0 and >=1.15.38
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.37 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.36 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.35 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.34 depends on botocore<1.16.0 and >=1.15.34
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.33 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.32 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.31 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.30 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.29 depends on botocore<1.16.0 and >=1.15.29
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.28 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.27 depends on botocore<1.16.0 and >=1.15.27
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.26 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.25 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.24 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.23 depends on botocore<1.16.0 and >=1.15.23
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.22 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.21 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.20 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.19 depends on botocore<1.16.0 and >=1.15.19
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.18 depends on botocore<1.16.0 and >=1.15.18
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.17 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.16 depends on botocore<1.16.0 and >=1.15.16
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.15 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.14 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.13 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.12 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.11 depends on botocore<1.16.0 and >=1.15.11
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.10 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.9 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.8 depends on botocore<1.16.0 and >=1.15.8
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.7 depends on botocore<1.16.0 and >=1.15.7
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.6 depends on botocore<1.16.0 and >=1.15.6
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.5 depends on botocore<1.16.0 and >=1.15.5
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.4 depends on botocore<1.16.0 and >=1.15.4
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.12.3 depends on botocore<1.16.0 and >=1.15.3
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.2 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.1 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.12.0 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.17 depends on botocore<1.15.0 and >=1.14.17
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.16 depends on botocore<1.15.0 and >=1.14.16
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.15 depends on botocore<1.15.0 and >=1.14.15
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.14 depends on botocore<1.15.0 and >=1.14.14
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.13 depends on botocore<1.15.0 and >=1.14.13
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.12 depends on botocore<1.15.0 and >=1.14.12
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.11.11 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.11.10 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.11.9 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.8 depends on botocore<1.15.0 and >=1.14.8
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.11.7 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.6 depends on botocore<1.15.0 and >=1.14.6
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.11.5 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.11.4 depends on s3transfer<0.4.0 and >=0.3.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.11.3 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.2 depends on botocore<1.15.0 and >=1.14.2
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.11.1 depends on s3transfer<0.4.0 and >=0.3.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.11.0 depends on botocore<1.15.0 and >=1.14.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.50 depends on botocore<1.14.0 and >=1.13.50
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.49 depends on botocore<1.14.0 and >=1.13.49
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.48 depends on s3transfer<0.3.0 and >=0.2.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.47 depends on botocore<1.14.0 and >=1.13.47
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.46 depends on s3transfer<0.3.0 and >=0.2.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.45 depends on botocore<1.14.0 and >=1.13.45
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.44 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.43 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.42 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.41 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.40 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.39 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.38 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.37 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.36 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.35 depends on s3transfer<0.3.0 and >=0.2.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.34 depends on botocore<1.14.0 and >=1.13.34
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.33 depends on botocore<1.14.0 and >=1.13.33
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.32 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.31 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.30 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.29 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.28 depends on s3transfer<0.3.0 and >=0.2.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.27 depends on botocore<1.14.0 and >=1.13.27
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.26 depends on s3transfer<0.3.0 and >=0.2.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.25 depends on botocore<1.14.0 and >=1.13.25
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.24 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.23 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.22 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.21 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.20 depends on s3transfer<0.3.0 and >=0.2.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.19 depends on botocore<1.14.0 and >=1.13.19
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.18 depends on botocore<1.14.0 and >=1.13.18
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.17 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.16 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.15 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.14 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.13 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.12 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.11 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.10 depends on s3transfer<0.3.0 and >=0.2.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.9 depends on botocore<1.14.0 and >=1.13.9
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.8 depends on botocore<1.14.0 and >=1.13.8
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.7 depends on botocore<1.14.0 and >=1.13.7
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.6 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.5 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.4 depends on s3transfer<0.3.0 and >=0.2.0
    moto 1.3.14 depends on botocore>=1.12.201
    boto3 1.18.60 depends on botocore<1.22.0 and >=1.21.60
    s3transfer 0.5.0 depends on botocore<2.0a.0 and >=1.12.36
    boto3 1.10.3 depends on botocore<1.14.0 and >=1.13.3
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.2 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.1 depends on s3transfer<0.3.0 and >=0.2.0
    boto3 1.18.60 depends on s3transfer<0.6.0 and >=0.5.0
    boto3 1.10.0 depends on s3transfer<0.3.0 and >=0.2.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies

The idea of this approach is that if the failure causes have been previously pinned beyond the most recent pinning then to "back jump" to the state where they were last pinned and attach the new backtrack causes to that state.

This allows the failure causes to all be focused on together and means the resolution doesn't get stuck backtracking on some intermediate pinned requirement. It may also require an extra step of deciding if not all the backtrack causes are currently available to pin then pin the non-backtrack causes until they are, but I have not experimented with this yet.

In general this approach is a lot more tricky and potentially prone to infinite loops if not careful, but I think worth further exploration and I will continue to do so.

@pradyunsg
Copy link
Contributor

pradyunsg commented Oct 13, 2021

apache-airflow[all]==1.10.13

This failing doesn't make sense -- there is a solution for this in the space that the resolver is exploring?

Doesn't pip install "apache-airflow[all]==1.10.13" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1.10.13/constraints-3.7.txt" work, in those cases?

@notatallshaw
Copy link
Contributor

notatallshaw commented Oct 13, 2021

apache-airflow[all]==1.10.13

This failing doesn't make sense -- there is a solution for this in the space that the resolver is exploring?

Doesn't pip install "apache-airflow[all]==1.10.13" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1.10.13/constraints-3.7.txt" work, in those cases?

Trying this command on my experimental version of Pip, as well as 21.3 and 21.2.4 I get the error for all three:

ERROR: Cannot install apache-airflow because these package versions have conflicting dependencies.

The conflict is caused by:
    moto 1.3.14 depends on idna<2.9 and >=2.5
    The user requested (constraint) idna==2.10

@pradyunsg
Copy link
Contributor

Hah, interesting!

@uranusjr
Copy link
Member Author

Yes, apache-airflow 1.10.13 can not actually pass the new resolver due to internal conflicts. From my understanding they spent a ton of time cleaning this up to make 2.0 (1.10 was the last 1.x release) work without warnings on the legacy resolver.

@notatallshaw
Copy link
Contributor

Yes, apache-airflow 1.10.13 can not actually pass the new resolver due to internal conflicts. From my understanding they spent a ton of time cleaning this up to make 2.0 (1.10 was the last 1.x release) work without warnings on the legacy resolver.

Yes, I was a heavy users of apache-airflow when pip 20.3 was released which is what got me started trying to understand how pip's resolver works. Why I'm using apache-airflow[all]==1.10.13 as a test case is I don't believe anyone had got pip to ever run long enough to get to a ResolutionImpossible error, so it has been my benchmark to see if this approach is helpful for "real-world" cases.

@notatallshaw
Copy link
Contributor

notatallshaw commented Oct 15, 2021

Actually @pradyunsg is correct, airflow[all]==1.10.13 does have a solution. My latest experimental version of using back jumps is able to find a solution to this requirement pretty quickly.

I think I am hitting a bug in how resolvelib determines incompatible requirements, I'm not sure if it's my changes to enable "back jumps" or if it's something already existing, but given I haven't changed any of the code around incompatible requirements I think it might be an existing bug. I'll try and see if I can simplify the issue and come up with a test and make a new issue (may take a long time).

Click to expand constraints that will allow `airflow[all]==1.10.13` to be installed on Linux for Python 3.7

adal==1.2.7
aiohttp==3.7.4.post0
alabaster==0.7.12
alembic==1.7.4
amqp==2.6.1
analytics-python==1.4.0
ansiwrap==0.8.4
apispec==1.3.3
argcomplete==1.12.3
asn1crypto==1.4.0
astroid==2.8.2
async-timeout==3.0.1
atlasclient==1.0.0
attrs==20.3.0
aws-sam-translator==1.39.0
aws-xray-sdk==2.8.0
azure-common==1.1.27
azure-core==1.19.0
azure-cosmos==3.2.0
azure-datalake-store==0.0.52
azure-identity==1.7.0
azure-keyvault==4.1.0
azure-keyvault-certificates==4.3.0
azure-keyvault-keys==4.4.0
azure-keyvault-secrets==4.3.0
azure-mgmt-containerinstance==1.5.0
azure-mgmt-core==1.3.0
azure-mgmt-resource==20.0.0
azure-nspkg==3.0.2
azure-storage==0.36.0
azure-storage-blob==2.1.0
azure-storage-common==2.1.0
Babel==2.9.1
backcall==0.2.0
backoff==1.10.0
backports.entry-points-selectable==1.1.0
bcrypt==3.2.0
beautifulsoup4==4.7.1
billiard==3.6.4.0
bleach==4.1.0
blinker==1.4
boto==2.49.0
boto3==1.13.26
botocore==1.16.26
cached-property==1.5.2
cachetools==4.2.4
cassandra-driver==3.20.2
cattrs==1.8.0
celery==4.4.7
certifi==2020.12.5
cffi==1.13.2
cfgv==3.3.1
cfn-lint==0.54.2
cgroupspy==0.2.1
chardet==3.0.4
click==6.7
cloudant==0.5.10
colorama==0.4.4
colorlog==4.0.2
configparser==3.5.3
coverage==6.0.2
croniter==0.3.37
cryptography==2.9.2
cx-Oracle==8.2.1
datadog==0.42.0
decorator==5.1.0
defusedxml==0.7.1
dill==0.3.4
distlib==0.3.3
dnspython==1.16.0
docker==3.7.3
docker-pycreds==0.4.0
docopt==0.6.2
docutils==0.15.2
ecdsa==0.17.0
elasticsearch==5.5.3
elasticsearch-dsl==5.4.0
email-validator==1.1.3
entrypoints==0.3
execnet==1.9.0
fastavro==1.4.5
filelock==3.3.1
flake8==4.0.1
flake8-colors==0.1.9
flaky==3.7.0
Flask==1.1.4
Flask-Admin==1.5.4
Flask-AppBuilder==2.3.4
Flask-Babel==1.0.0
Flask-Bcrypt==0.7.1
Flask-Caching==1.3.3
Flask-JWT-Extended==3.25.1
Flask-Login==0.4.1
Flask-OpenID==1.3.0
Flask-SQLAlchemy==2.5.1
flask-swagger==0.2.14
Flask-WTF==0.14.3
flower==0.9.7
freezegun==1.1.0
fsspec==2021.10.1
funcsigs==1.0.2
future==0.18.2
future-fstrings==1.2.0
gcsfs==2021.10.1
gitdb==4.0.7
GitPython==3.1.24
google-api-core==1.31.3
google-api-python-client==1.12.8
google-auth==1.35.0
google-auth-httplib2==0.1.0
google-auth-oauthlib==0.4.6
google-cloud-bigquery==1.27.2
google-cloud-bigquery-storage==1.1.0
google-cloud-bigtable==1.7.0
google-cloud-container==1.0.1
google-cloud-core==1.7.2
google-cloud-dlp==1.0.0
google-cloud-language==1.3.0
google-cloud-secret-manager==1.0.0
google-cloud-spanner==1.19.1
google-cloud-speech==1.3.2
google-cloud-storage==1.42.3
google-cloud-texttospeech==1.0.1
google-cloud-translate==1.7.0
google-cloud-videointelligence==1.16.1
google-cloud-vision==1.0.0
google-crc32c==1.3.0
google-resumable-media==1.3.3
googleapis-common-protos==1.53.0
graphviz==0.17
greenlet==1.1.2
grpc-google-iam-v1==0.12.3
grpcio==1.41.0
grpcio-gcp==0.2.2
gunicorn==20.1.0
hdfs==2.6.0
hmsclient==0.1.1
httplib2==0.20.1
humanize==3.12.0
hvac==0.11.2
identify==2.3.0
idna==2.8
ijson==2.6.1
imagesize==1.2.0
importlib-metadata==2.1.1
importlib-resources==5.2.2
inflection==0.5.1
ipdb==0.13.9
ipython==7.28.0
ipython-genutils==0.2.0
iso8601==0.1.16
isodate==0.6.0
itsdangerous==1.1.0
JayDeBeApi==1.2.3
jedi==0.18.0
jeepney==0.7.1
Jinja2==2.11.3
jira==3.0.1
jmespath==0.10.0
JPype1==0.7.1
json-merge-patch==0.2
jsondiff==1.1.2
jsonpatch==1.32
jsonpointer==2.1
jsonschema==3.2.0
junit-xml==1.9
jupyter-client==7.0.6
jupyter-core==4.8.1
jupyterlab-pygments==0.1.2
keyring==22.3.0
kombu==4.6.11
kubernetes==11.0.0
lazy-object-proxy==1.6.0
ldap3==2.9.1
lockfile==0.12.2
Mako==1.1.5
Markdown==2.6.11
MarkupSafe==2.0.1
marshmallow==2.21.0
marshmallow-enum==1.5.1
marshmallow-sqlalchemy==0.23.1
matplotlib-inline==0.1.3
mccabe==0.6.1
mistune==0.8.4
mock==4.0.3
mongomock==3.23.0
monotonic==1.6
more-itertools==8.10.0
moto==1.3.14
msal==1.15.0
msal-extensions==0.3.0
msrest==0.6.21
msrestazure==0.6.4
multi-key-dict==2.0.3
multidict==5.2.0
mypy==0.720
mypy-extensions==0.4.3
mysqlclient==1.3.14
natsort==7.1.1
nbclient==0.5.4
nbconvert==6.2.0
nbformat==5.1.3
nest-asyncio==1.5.1
networkx==2.6.3
nodeenv==1.6.0
nteract-scrapbook==0.4.2
ntlm-auth==1.5.0
numpy==1.21.2
oauthlib==3.1.1
oscrypto==1.2.1
packaging==21.0
pandas==1.3.3
pandas-gbq==0.15.0
pandocfilters==1.5.0
papermill==1.2.1
parameterized==0.8.1
paramiko==2.8.0
parso==0.8.2
pathspec==0.9.0
pbr==5.6.0
pendulum==1.4.4
pexpect==4.8.0
pickleshare==0.7.5
pinotdb==0.1.1
platformdirs==2.4.0
pluggy==0.13.1
portalocker==1.7.1
pre-commit==2.15.0
presto-python-client==0.7.0
prison==0.2.1
prometheus-client==0.8.0
prompt-toolkit==3.0.20
protobuf==3.17.3
psutil==5.8.0
psycopg2-binary==2.9.1
ptyprocess==0.7.0
pure-sasl==0.6.2
py==1.10.0
pyarrow==0.17.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycodestyle==2.8.0
pycparser==2.20
pycryptodomex==3.11.0
pydata-google-auth==1.2.0
pydruid==0.5.8
pyflakes==2.4.0
Pygments==2.10.0
PyHive==0.6.4
PyJWT==1.7.1
pykerberos==1.2.1
pymongo==3.10.1
pymssql==2.1.5
PyNaCl==1.4.0
pyOpenSSL==19.1.0
pyparsing==2.4.7
pyrsistent==0.18.0
pysftp==0.2.9
PySmbClient==0.1.5
pytest==5.4.3
pytest-cov==3.0.0
pytest-forked==1.3.0
pytest-instafail==0.4.2
pytest-rerunfailures==10.2
pytest-timeouts==1.2.1
pytest-xdist==1.34.0
python-daemon==2.3.0
python-dateutil==2.8.2
python-http-client==3.3.3
python-jenkins==1.7.0
python-jose==3.3.0
python-nvd3==0.15.0
python-slugify==4.0.1
python3-openid==3.2.0
pytz==2020.5
pytzdata==2020.1
pywinrm==0.4.2
PyYAML==6.0
pyzmq==22.3.0
qds-sdk==1.16.1
redis==3.5.3
requests==2.23.0
requests-futures==0.9.4
requests-kerberos==0.12.0
requests-mock==1.9.3
requests-ntlm==1.1.0
requests-oauthlib==1.3.0
requests-toolbelt==0.9.1
responses==0.14.0
rsa==4.7.2
s3transfer==0.3.7
sasl==0.3.1
SecretStorage==3.3.1
sendgrid==5.6.0
sentinels==1.0.0
sentry-sdk==1.4.3
setproctitle==1.2.2
six==1.16.0
slackclient==1.3.2
smmap==4.0.0
snakebite-py3==3.0.5
snowballstemmer==2.1.0
snowflake-connector-python==2.2.6
snowflake-sqlalchemy==1.3.2
soupsieve==2.2.1
Sphinx==4.2.0
sphinx-argparse==0.3.1
sphinx-autoapi==1.0.0
sphinx-copybutton==0.4.0
sphinx-jinja==1.1.1
sphinx-rtd-theme==1.0.0
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-dotnetdomain==0.4
sphinxcontrib-golangdomain==0.2.0.dev0
sphinxcontrib-htmlhelp==2.0.0
sphinxcontrib-httpdomain==1.8.0
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
SQLAlchemy==1.4.25
SQLAlchemy-JSONField==0.9.0
SQLAlchemy-Utils==0.37.8
sshpubkeys==3.3.1
sshtunnel==0.1.5
tabulate==0.8.9
tenacity==4.12.0
testpath==0.5.0
text-unidecode==1.3
textwrap3==0.9.2
thrift==0.15.0
thrift-sasl==0.4.3
toml==0.10.2
tomli==1.2.1
tornado==5.1.1
tqdm==4.62.3
traitlets==5.1.0
typed-ast==1.4.3
typing-extensions==3.10.0.2
tzlocal==1.5.1
unicodecsv==0.14.1
Unidecode==1.3.2
uritemplate==3.0.1
urllib3==1.25.11
vertica-python==1.0.2
vine==1.3.0
virtualenv==20.8.1
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==0.54.0
Werkzeug==0.16.1
wrapt==1.12.1
WTForms==2.3.3
xmltodict==0.12.0
yamllint==1.26.3
yarl==1.7.0
zdesk==2.7.1
zipp==3.6.0
zope.deprecation==4.4.0

@pradyunsg
Copy link
Contributor

That might be what #91 is about as well.

@pradyunsg
Copy link
Contributor

FWIW, is pip check happy with the final resolution result for airflow?

@notatallshaw
Copy link
Contributor

notatallshaw commented Oct 15, 2021

That might be what #91 is about as well.

Yes, going to be watching that issue, if a PR is developed I will test it against this use case.

FWIW, is pip check happy with the final resolution result for airflow?

Yes.

@uranusjr
Copy link
Member Author

Remember that pip check does not take extras into account so pip check passing isn't that meaningful for this particular case (IIRC the base Airflow resolves very quickly even on the first even resolver release, it only chokes when you start adding extras).

@notatallshaw
Copy link
Contributor

notatallshaw commented Oct 25, 2021

Remember that pip check does not take extras into account so pip check passing isn't that meaningful for this particular case (IIRC the base Airflow resolves very quickly even on the first even resolver release, it only chokes when you start adding extras).

Thanks for the info, I'll manually check if the extras are all also satisfied. But I think they are because I can pass the constraints file I generate from pip freeze in the experimental branch I have to pip 21.3 and it to install fine.

@notatallshaw
Copy link
Contributor

notatallshaw commented Dec 13, 2021

I've not had a lot of time to work on this and since pip 21.3 was released no one has reported any very significant backtracking times to the pip tracker so there hasn't been a ton of motivation. But a quick update on what I've tried so far and discovered:

I implemented backjumping by removing the previous states until the causes of the conflict are removed and then applying the preference order of which package to pin next. In the case of airflow[all]==1.10.13 it dramatically improved performance but in some of my other test cases it reduced the performance significantly (~5x slower) for what I think are three reasons:

  1. Backjumping with this approach inherently removes some information about existing compatibilities / conflicts
  2. One state does not represent one additional pinned package, it can be multiple pinned packages, therefore deleting a state can cause an "over backjump" where too many packages are unpinned
  3. Checks for infinite loops have to be put in to make sure you don't backjump to a state with the same pinning preferences more than once

I might experiment with the following changes:

  1. Have each state always represent one pinned package, this makes backjumps (and other graph algorithms) easier to reason about
  2. Have a unpin method that surgically removes a previously pinned package, only removing packages which are children of this package

However I had another idea that I think compared to backjumping would be less disruptive and potentially yield most of the benefit with almost none of the downside:

Rearrange the order in which the packages have been pinned before backtracking. So let's say you've pinned packages 1 to 10 but when you try to pin package 11 it conflicts with packages 2 and 8, the idea would be to reorder (as much as possible while not breaking the order derived from the dependency graph) the current pinning from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 to 1, 3, 4, 5, 6, 7, 9, 10, 2, 8. This way when a backtrack occurs it immediately has to resolve the conflicts with 2, 8, and 11.

It could also be the provider that tells resolvelib what the order should be, so any of the preferences can be controlled by the downstream system.

This is the approach I am most excited for and interested in and therefore when I get time will try working on unless anyone else has specific input on any of these.

@frostming
Copy link
Member

This could be solved by #113

@notatallshaw
Copy link
Contributor

notatallshaw commented Feb 4, 2023

This could be solved by #113

Agreed, I think in general it should be a big improvement to backtracking situations!

I'm going through some known problematic requirements (including the ones I mention in this comment pypa/pip#10481 (comment)) to try and find any that break or get much worse. I have already commented on the PR.

@wolfv
Copy link

wolfv commented May 29, 2023

Hey all, we've ported resolvelib to Rust (https://github.com/prefix-dev/resolvelib-rs) but for resolving conda packages still run into some performance issues that we don't see with libsolv. For that reason, we will probably spend some time to implement some version of CDCL in resolvelib-rs. I'll update you here if it works out and gives performance benefits!

@notatallshaw
Copy link
Contributor

Hey all, we've ported resolvelib to Rust (https://github.com/prefix-dev/resolvelib-rs) but for resolving conda packages still run into some performance issues that we don't see with libsolv. For that reason, we will probably spend some time to implement some version of CDCL in resolvelib-rs. I'll update you here if it works out and gives performance benefits!

Note that in the current implementation of resolvelib it's quite important what the downstream implements, in particular for real world dependency graphs I suspect there will be a big difference between prioritizing or not prioritizing the causes of the conflict in the get_preference method.

@wolfv
Copy link

wolfv commented May 29, 2023

@notatallshaw thanks for the note, indeed for the one problematic case we already found a significant speed up by implementing the same preference you proposed in pypa/pip#10479 .

It went from 50 secs to 19 seconds (which is still unacceptable since libsolv only takes <1 sec)

We'll do further research. Ideally I think the preference should not be necessary? Let's see. We'll also double check which heuristics libsolv is using :)

@pfmoore
Copy link
Collaborator

pfmoore commented May 29, 2023

It went from 50 secs to 19 seconds (which is still unacceptable since libsolv only takes <1 sec)

Just to confirm, you are aware that a key aspect of resolvelib is that it doesn't need to know the full dependency graph in advance? Because it can't make that assumption, comparing with libsolv (which does require the full graph up front to construct the SAT problem, I believe) needs to be done with caution, as you're not comparing like with like.

@notatallshaw
Copy link
Contributor

Ideally I think the preference should not be necessary?

Well Pip is dealing with graphs that are not known ahead of time, there is no giant file of every single dependency you can download from PyPI (unlike say conda-forge where you can). So preference is necessary because choosing the next candidate is not only about checking a solution but also gives you more or less information about the graph as a whole. The downstream library may know something about the candidate that will yield more information that can not be represented in a generic way to the graph solver.

If you are in a situation where you do know the entire graph ahead of time may I suggest that you investigate if resolvelib is the best solution or whether it's better to try and port to a more formal solver / satisfiability library.

That said I would love to see work on a real CDCL implementation for resolvelib, so if you go that route it would be great to see if the work can be backported.

@wolfv
Copy link

wolfv commented May 29, 2023

I wasn't aware :) thanks for the clarification & makes sense.

I am not sure that "the unknown part" is in the way of CDCL (since the backtrackable part of the graph is known for both pip and conda). I also think that libsolv "explores" the graph in a similar way as resolvelib does (libsolv does indeed preload all the repository data).
libsolv does order the dependencies by version (and more metrics, depending on the repository type) and then uses that as preference for the exploration. For backtracking it backjumps according to CDCL rules (I believe).

I am not an expert on these matters and don't want to make anyone mad! We've been super happy with the design of resolvelib and building the first prototype in Python took less than ~half a day (which motivated us to attempt a Rust port).

If we make any progress with CDCL I'll definitely let you know, and I hope that my messages aren't coming across as hostile :)

@notatallshaw
Copy link
Contributor

If we make any progress with CDCL I'll definitely let you know, and I hope that my messages aren't coming across as hostile :)

Not to me at least, the more places resolvelib is tested the better for any of the Python ecosystem that depends on Pip (which for now is still the vast majority of the ecosystem).

This conversation did trigger a thought on a performance improvement from an issue I remembered from awhile back and I filed an issue #131

@wolfv
Copy link

wolfv commented May 29, 2023

Also regarding using a more formal SAT library I am not sure they would work easily. We're not only solving a SAT problem, but more like a MaxSAT problem as far as I understand (since we also try to maximize version numbers). I looked at a couple of solvers today (e.g. the CP-SAT solver from or-tools https://developers.google.com/optimization/cp/cp_solver) or https://github.com/shnarazk/splr. I think the ortools solver could work (also for the maximization problem) but ideally we'd have a smaller thing that we can fully understand for the problem we're solving (package management).

@pfmoore
Copy link
Collaborator

pfmoore commented May 29, 2023

I hope that my messages aren't coming across as hostile :)

Far from it! It's great to see other uses of resolvelib, and it's also nice to get feedback from other implementations. There's a lot of difficult problems in this area, and extra perspectives always help 🙂

@notatallshaw
Copy link
Contributor

@notatallshaw thanks for the note, indeed for the one problematic case we already found a significant speed up by implementing the same preference you proposed in pypa/pip#10479 .

It went from 50 secs to 19 seconds (which is still unacceptable since libsolv only takes <1 sec)

If you have any time at all I would be curious if this change improves your performance at all #132 ?

It moves the responsibility of preferring backtrack causes to resolvelib rather than the downstream library and avoids an O(n2) issue in some circumstances, there was a previous real world report of this causing ~40% performance overhead issue for a user but their index was private so I could never reproduce such an extreme result myself.

@aochagavia
Copy link

aochagavia commented Jul 10, 2023

Update: we ended up using CDCL after all (we ported libsolv to Rust, and I wrote an introductory blog post explaining the CDCL approach to dependency resolution). Maybe @wolfv can tell you about how he plans to make it work for PyPI and not only Conda channels

@notatallshaw
Copy link
Contributor

notatallshaw commented Jul 10, 2023

Maybe @wolfv can tell you about how he plans to make it work for PyPI

I think the main question here will be how do you the handle the cost of potentially having to download and build a package each time the resolver wants to discover the dependencies of a node on the dependency graph? This is the requirement for resolvelib to build the graph iteratively and not just download the whole graph ahead of time.

Reading your blog and the linked paper to confirm I wasn't obviously missing anything, my understanding is traditional SAT solvers work well with conda because all the dependency information can be known ahead of time (I understand in practice for any given resolution in conda you might be downloading up to 4 files, and perhaps streaming the json lazily to save memory and/or CPU cost). So individual steps of checking boolean satisfiability don't need to worry about the cost of getting that information from the dependency graph.

Or perhaps I'm missing something and you can build the graph iteratively in a natural way using your approach.

@wolfv
Copy link

wolfv commented Jul 10, 2023

Hi @notatallshaw we're definitely not sure yet, how to do it.

Obviously the fact that many Python packages ship static metadata in wheel files is a big plus for us, so the worst case solution would be to build some sort of static index that only works with wheel files.

However, as the name of CDCL implies, clauses can (and are) added on the fly already (learnt clauses). Potentially this can be generalized to support adding dependency clauses on the go as well.

PubGrub seems to have this figured out, and conceptually that solver is not so far away from ours (the key difference is that PubGrub quite strongly suggests using ranges for version constraints, while we don't really care about the shape of the constraint).

I've also asked ChatGPT about this problem, and it came up with a bunch of existing "iterative" SAT solvers (ISAT, CryptoMiniSAT, or Glucose). Wether this is true I can't judge yet, I looked at CryptoMiniSAT for a second, and it seemed to indeed support adding clauses on the fly.

This makes me hopeful that we could also find some theoretical background for iterative SAT solving :)

Current status of this project is: working on the parts to retrieve metadata from PyPI's API. Once we've wired things up will let you know about the progress!

@notatallshaw
Copy link
Contributor

notatallshaw commented Jul 10, 2023

Well in terms of static metadata from wheels I will point your attention to PEP 658 / 714 which is live for new packages on Pypi, and my understanding is will be backfilled: https://discuss.python.org/t/pep-658-714-are-now-live-on-pypi/26693

FYI my understanding is that Poetry's resolution algorithm is based on PubGrub and while it works well for most cases, at least the way Poetry uses it it does have lots of real world situations that it gets stuck resolving for hours. But I don't keep a super close eye on Poetry, maybe things have got a lot better. Either way they do have a CDCL SAT solver that works with Pypi, so if you haven't perhaps worth looking at their implementation.

@wolfv
Copy link

wolfv commented Jul 10, 2023

Yep, those pep's are good news and I've been testing them out :)

@aochagavia
Copy link

Heads up, today the resolvo package solver was released, which should be flexible enough to resolve PyPI packages (@wolfv probably has more details).

@uranusjr
Copy link
Member Author

How would it work? The comment lacks so much details.

@wolfv
Copy link

wolfv commented Sep 27, 2023

Hey @uranusjr - we are now very close to sharing the complete tooling with the world (probably end of this week). Our new thing (called rip for rattler-installs-packages) uses resolvo in a lazy fashion to resolve PyPI / pip packages.
To achieve this we first made the library (used to be called libsolv-rs) generic (so that we can add any "DependencyProvider", similar to resolvelib or pubgrub-rs). And we spent a good deal of effort to make it lazy, too.

Hopefully the talk at PackagingCon can shed some additional light on how we achieved this. We'll also have some blog posts etc. But the high level is that you can provide your own packages and dependency matching mechanism to resolvo to resolve packages.

We'll come back asking for more feedback once the rip tool is public!

@wolfv
Copy link

wolfv commented Sep 28, 2023

If you want to have a look: https://github.com/prefix-dev/rip – would love to hear your feedback & thoughts!

@notatallshaw
Copy link
Contributor

If you want to have a look: https://github.com/prefix-dev/rip – would love to hear your feedback & thoughts!

Haven't been able to test it's performance yet as it seems to be getting tripped up easily over non-compliant requirements, or at least things it thinks are non-compliant, of real world packages, I filed some issues:

@baszalmstra
Copy link

Hey @notatallshaw ! Thanks for giving rip a try, I appreciate the effort! We'll take a look at these issues and fix them soon!

@pradyunsg
Copy link
Contributor

pradyunsg commented Oct 1, 2023

We'll come back asking for more feedback once the rip tool is public!

Looking forward to this getting sdist support -- once that's implemented, I expect we'd have a bunch of things from that to either directly reuse or borrow from it!

@notatallshaw
Copy link
Contributor

notatallshaw commented Dec 14, 2023

I'm starting to feel confident about my approach to favor conflicting causes, which, I believe, will ultimately complete resolvelib + pip's backtracking search as a truly CDCL-style algorithm.

I'm fairly sure resolvelib cannot implement this on its side because it can't guarantee enough about the requirements objects. While it could be implemented via get_preference on the Pip side, doing so would introduce a non-trivial O(n^2) issue, which would be detrimental for resolutions that don't benefit from CDCL.

Therefore, I propose an additional step in resolvelib that allows the provider to filter unsatisfied names before getting a preference: #145.

I've created an experimental branch of Pip: pip:prefer-conflicting-causes. So far, I've observed significant improvements:

  • Quickly solves several requirements that currently produce ResolutionTooDeep.
  • Even if backjumping is removed, it can solve pathological resolutions rapidly.
  • Minor performance improvement in resolution that doesn't benefit from CDCL, due to using a filter step to only select causes for get_preference, thereby avoiding the current O(n^2) issue.

I plan to conduct more testing with additional pathological requirements, both manually and, if possible, by converting some of the requirements into scenarios using pip-resolver-benchmarks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

8 participants