Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow provider to filter unsatisfied names, when backtracking #145

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

notatallshaw
Copy link
Contributor

This allows the provider to optimize the steps it takes to choose what to backtrack on.

Specifically the aim is to add preferring conflicting causes in Pip without the overhead of recalculating or caching each time get_preference is called. I initially wanted to do a lot of this "preferring" work in resolvelib itself, but upon inspecting the resolution types I realized there was no guarantee of required information (which would have been a problem if this previous PR had landed #132).

I have a Pip branch that implements this filtering: https://github.com/notatallshaw/pip/blob/prefer-conflicting-causes/src/pip/_internal/resolution/resolvelib/provider.py#L240. Which on Python 3.8 allows it to solve the following requirement which currently produces a "ResolutionTooDeep" error: --only-binary ":all:" "numpy==1.21.6" "cython==0.29.28" "scipy>=1.4.0" "torch>=1.7" "torchaudio" "soundfile" "librosa==0.10.0.*" "numba==0.55.1" "inflect==5.6.0" "tqdm" "anyascii" "pyyaml" "fsspec>=2021.04.0" "aiohttp" "packaging" "flask" "pysbd" "pandas" "matplotlib" "trainer==0.0.20" "coqpit>=0.0.16" "pypinyin" "mecab-python3==1.0.5" "jamo" "bangla==0.0.2" "k_diffusion" "einops" "transformers"

@notatallshaw notatallshaw force-pushed the when-backtracking-allow-provider-to-filter-unsatisfied-names-before-picking-preference branch from 8cb7fec to 76af055 Compare December 1, 2023 05:13
@notatallshaw

This comment was marked as outdated.

@notatallshaw

This comment was marked as outdated.

@notatallshaw

This comment was marked as outdated.

@notatallshaw
Copy link
Contributor Author

I'm not really a fan of the name filter_unsatisfied_names, but I can't think of a better name right now.

@notatallshaw

This comment was marked as outdated.

@notatallshaw

This comment was marked as outdated.

@notatallshaw notatallshaw force-pushed the when-backtracking-allow-provider-to-filter-unsatisfied-names-before-picking-preference branch 2 times, most recently from 748e890 to dc0c4b9 Compare February 3, 2024 22:04
@notatallshaw
Copy link
Contributor Author

I've updated this PR based on the conversation here: pypa/pip#12497

Main changes are:

  • filter_unsatisfied_names -> narrow_backtrack_selection, still would like a better name, but I think it addresses the concerns raised in that thread
  • Significantly updated the docstring to explain the purpose of the method and how it differs to get_preference

@notatallshaw notatallshaw force-pushed the when-backtracking-allow-provider-to-filter-unsatisfied-names-before-picking-preference branch from dc0c4b9 to 97ff2cb Compare February 4, 2024 15:37
names, e.g., because the checks are prohibitively expensive.

Returns:
Iterable[KT]: A non-empty subset of `unsatisfied_names`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the docstring be extended to describe the meaning of the various arguments? For example, "information" doesn't really describe what the provider can expect to find in that value... 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, since this method now matches get_preference almost entirely (it didn't when I opened this PR) I've updated the docstring to far more closely match get_preference, including pulling in the parameter descriptions from there.

I also updated the method name to narrow_requirement_selection, as it fits the docstring descriptions better now, and I like that backtrack isn't part of the name (in fact I would like to remove "backtrack" from the parameter name as well, but that would be a breaking change for get_preference, I was the one that added that parameter name but I wasn't thinking about resolvelib in a wider context when I added it).

@notatallshaw notatallshaw force-pushed the when-backtracking-allow-provider-to-filter-unsatisfied-names-before-picking-preference branch 4 times, most recently from 9f2c547 to d853beb Compare February 6, 2024 23:37
Comment on lines 458 to 463
if len(filtered_unstatisfied_names) > 1:
name = min(
filtered_unstatisfied_names, key=self._get_preference
)
else:
name = filtered_unstatisfied_names[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why need this if-else? min should return immediately if the length is 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min still calls then key function if there is 1 element, the point is to avoid an unnecessary call to get_preference

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the logic, it's worth a comment. (Also, I'd argue that min shouldn't do that, but I'm sure there's some pathological case out there that would break if it changed 🙁)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@notatallshaw notatallshaw force-pushed the when-backtracking-allow-provider-to-filter-unsatisfied-names-before-picking-preference branch 2 times, most recently from 1cf5bce to 2c36fdd Compare February 10, 2024 18:31
@notatallshaw notatallshaw force-pushed the when-backtracking-allow-provider-to-filter-unsatisfied-names-before-picking-preference branch from 5daed63 to c52395a Compare March 30, 2024 17:13
@notatallshaw
Copy link
Contributor Author

It occurs to me that resolvelib could validate that resolvelib could validate that the return value of self._p.narrow_requirement_selection is a subset of unsatisfied_names, this would guarantee that this method does not affect the soundness of the resolution, but at the cost of doing a subset check every backtrack iteration.

I am inclined to leave it to the documentation of how the client must implement narrow_requirement_selection.

Comment on lines +457 to +464
# If there is only 1 unsatisfied name skip calling self._get_preference
if len(narrowed_unstatisfied_names) > 1:
# Choose the most preferred unpinned criterion to try.
name = min(
narrowed_unstatisfied_names, key=self._get_preference
)
else:
name = narrowed_unstatisfied_names[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s worthwhile to add an explicit check to ensure narrowed_unstatisfied_names is not empty. User errors happen. Or maybe it’s viable to allow returning an empty list and do something sensible (what?) in the situation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants