Make install command upgrade packages by default #3786

Closed
pradyunsg opened this issue Jun 8, 2016 · 99 comments
Labels
auto-locked (Outdated issues that have been locked by automation), C: upgrade (The logic of upgrading packages)

Comments

@pradyunsg
Member

pradyunsg commented Jun 8, 2016

  • Pip version: Future versions, hopefully 10.0
  • Python version: All supported
  • Operating System: All supported

Based on discussion over at #59, there's interest in making the install command upgrade an installed package by default. This behaviour would make pip consistent with various other package managers with regard to the behaviour of its install command.

This issue is meant to be the location for that discussion, since this deserves its own issue.

@njsmith
Member

njsmith commented Jun 8, 2016

Okay, here's a proposal:

End goal (where we want to end up)

  • pip install foo: upgrades foo to the latest version; also does the minimum set of installs/upgrades required to satisfy the new version's dependencies
  • pip install -U foo / pip install --upgrade foo: identical to pip install foo (except maybe they should eventually issue some warning?); kept for back-compat
  • pip require foo: same as the current pip install foo; has the same effect as installing a package that has Requires-Dist: foo. This is a weird low-level operation that should not be emphasized in the docs, but we keep it for now to provide a less-bumpy transition, plus it exposes a meaningful operation we need to support anyway (the Requires-Dist handling), so it's likely useful for some scripting use case.
  • pip install --upgrade-recursive foo: same as the current pip install --upgrade foo -- ensures that foo is the latest version and ensures that all transitive dependencies are the latest version. This is a weird marginal option that should not be emphasized in the docs, but we keep it for now to provide a less-bumpy transition.
  • pip install --upgrade-non-recursive foo: same as the future pip install foo, but explicit to provide a less-bumpy transition.
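
To make the difference between the non-recursive and the recursive upgrade concrete, here is a minimal sketch. Everything in it is hypothetical: the toy `INDEX`, the `plan_upgrade` helper, and the simplification that versions are integers and every dependency is just a minimum version. It is not how pip works internally.

```python
# Toy model, not pip's implementation. Versions are plain integers and
# every dependency is a (name, minimum_version) pair.
# INDEX maps package -> {version: [(dep_name, min_version), ...]}
INDEX = {
    "foo": {1: [("bar", 1)], 2: [("bar", 2)]},
    "bar": {1: [("baz", 1)], 2: [("baz", 1)], 3: [("baz", 1)]},
    "baz": {1: [], 2: []},
}


def latest(name):
    """Newest version of `name` known to the toy index."""
    return max(INDEX[name])


def plan_upgrade(target, installed, recursive):
    """Return {name: version} for everything that would be (re)installed.

    recursive=False: upgrade `target` to latest and touch a dependency only
                     when the installed version no longer satisfies it.
    recursive=True:  upgrade `target` and every transitive dependency to
                     latest (what `--upgrade` does in the thread's Phase 0).
    """
    plan = {}
    seen = set()
    queue = [(target, None)]  # (name, minimum version required, or None)
    while queue:
        name, minimum = queue.pop()
        if name in seen:
            continue
        seen.add(name)
        have = installed.get(name)
        satisfied = have is not None and (minimum is None or have >= minimum)
        if name == target or recursive or not satisfied:
            chosen = latest(name)
            if chosen != have:
                plan[name] = chosen
        else:
            chosen = have  # leave the installed version alone
        queue.extend(INDEX[name][chosen])
    return plan
```

With `installed = {"foo": 1, "bar": 2, "baz": 1}`, the non-recursive plan only touches `foo`, while the recursive plan also pulls `bar` and `baz` up to their latest versions.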

Transition option A

  • Phase 0: what we have now -- pip install foo doesn't upgrade, pip install --upgrade foo does a recursive upgrade, pip require foo & pip install --upgrade-recursive foo are errors
  • Phase 1:
    • we add pip require foo, pip install --upgrade-recursive foo, pip install --upgrade-non-recursive foo.
    • pip install foo and pip install --upgrade foo continue to act like they do now, but are modified to check what they would have done if --upgrade-non-recursive were set, and issue a deprecation warning whenever what they actually do is different from what they will do in the future.
    • Users who want to opt-in to the future behavior (and silence the warnings) can use the usual configury to set --upgrade-non-recursive as their default (e.g. adding [install] upgrade-non-recursive = yes to pip.conf)
  • Phase 2:
    • pip install foo and pip install --upgrade foo switch to the new behavior.
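
The Phase 1 idea of warning only when the outcome would differ could be sketched roughly as follows. The warning class, the function name, and the representation of a plan as a `{package: version}` dict are all assumptions made for illustration; how each plan is computed is left out.

```python
import warnings


class BehaviourChangeWarning(FutureWarning):
    """Hypothetical warning category used only in this sketch."""


def warn_if_behaviour_changes(command, current_plan, future_plan):
    """Warn only when the future default would act differently.

    Both plans are {package_name: version_to_install} dicts.
    Returns True when a warning was issued, False otherwise.
    """
    names = set(current_plan) | set(future_plan)
    changed = sorted(
        name for name in names
        if current_plan.get(name) != future_plan.get(name)
    )
    if not changed:
        return False
    warnings.warn(
        f"'{command}' will behave differently in a future release; "
        f"affected packages: {', '.join(changed)}. "
        "Opt in to the new behaviour to silence this message.",
        BehaviourChangeWarning,
        stacklevel=2,
    )
    return True
```

Note that this requires computing both plans on every run, which is the extra work mentioned later in the thread.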

Transition option B

KISS: skip phase 1 and go directly from phase 0 to phase 2. Rationale: it's not clear that this will actually break anything, people are going to be somewhat confused and annoyed in either case, it's entirely possible they'll be more confused and annoyed by the phased transition than by the actual change, we have limited resources, and we're eager to get to the shiny new future.

In this version we can also probably skip adding --upgrade-non-recursive, since it's immediately redundant as soon as it's introduced.

Comment

I'm sorta expecting that everyone will push back and insist on transition option A instead of transition option B. But I'd actually be happy with either one, so instead of pre-emptively compromising I'm going to let someone else make that argument (if they want to) :-).

@xavfernandez
Member

Hmm, I don't see the added value of your pip require? It looks like a duplicate of pip install? Or maybe a pip install --no-upgrade?

@pfmoore
Member

pfmoore commented Jun 8, 2016

I'm happy enough with option B.

But I don't follow your description. You say pip require foo: same as the current pip install foo. So it'll error if foo is installed? And pip install --upgrade-recursive foo: same as the current pip install --upgrade foo. I thought there were problems with the existing install --upgrade behaviour (beyond it not being the default) - there's a whole load of discussion somewhere about needing a SAT solver. Is your proposal that we don't do anything about those issues? Or am I misremembering and there's not actually a problem with the current --upgrade behaviour?

@dstufft
Member

dstufft commented Jun 8, 2016

I'm happy with option B.

I don't like the idea of a pip require command for the same reasons I didn't like the split pip install and pip upgrade commands. Two commands that do sort of the same thing but not quite forces people to make a decision about which one they use up front, versus using flags. I also think that it's good practice for boolean flags (ones that toggle something on/off) to have an inverse wherever it makes sense, to allow people to compose commands better.

So with all that in mind, here's what I would do:

  • pip install --upgrade ... now does a "minimal" upgrade by default, upgrading anything named on the command line/requirements file to the latest version, but only updating dependencies if required.
  • pip install --no-upgrade ... behaves as pip install does now, similarly to your pip require command, and just ensures that a version, any version, of the named requirements is installed.
  • pip install ... has its default switched from an implicit --no-upgrade to an implicit --upgrade.

You might notice that there's nothing like the current behavior listed so far, an "upgrade everything in the dependency path to the latest version" sort of flag. I'm on the fence about whether we really want something like that (and if we want it, do we want to keep it forever, or would it just be a temporary shim to ease transition). Another thing to keep in mind when deciding this is how the theoretical pip upgrade command affects this decision. In other words, if we have a command to upgrade all the installed items, do we foresee people ever wanting to upgrade X and all of its dependencies?

If we do want something like the current --upgrade behavior, then I think I see two options:

  • --recursive / --no-recursive To turn on the old or new behavior (but what would these do if --no-upgrade was selected? Silent no-op? Error?).
  • --upgrade-strategy=(minimal / recursive) to switch between two different strategies, a bit wordier than --[no-]recursive, but also makes it easier to add additional strategies if we ever find ourselves in need of them.
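
As a rough illustration of how the flag combination might be spelled, here is a sketch using Python's standard argparse module. This is not pip's actual option parser. The program name and the choice to reject --upgrade-strategy together with --no-upgrade (rather than silently ignoring it) are assumptions; the thread leaves that question open.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="install-sketch")
    parser.add_argument("requirements", nargs="+")
    # --upgrade / --no-upgrade share one destination; upgrading is the
    # proposed new default.
    parser.add_argument("--upgrade", dest="upgrade",
                        action="store_true", default=True)
    parser.add_argument("--no-upgrade", dest="upgrade",
                        action="store_false")
    # default=None lets us tell "not given" apart from an explicit choice.
    parser.add_argument("--upgrade-strategy",
                        choices=["minimal", "recursive"], default=None)
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.upgrade and args.upgrade_strategy is not None:
        # Assumption: treat the meaningless combination as a usage error.
        parser.error("--upgrade-strategy has no effect with --no-upgrade")
    if args.upgrade and args.upgrade_strategy is None:
        args.upgrade_strategy = "minimal"
    return args
```

A single option with named choices keeps room for further strategies without adding more boolean flag pairs.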

In terms of the dependency resolver, I don't think these two issues are really intertwined that much. Our resolving is currently a problem in both the pip install and the pip install --upgrade case, and I believe it will continue to be a problem with the proposed changes. It's something that needs fixing, but I don't think it has any bearing on what we do here (although it likely does have some bearing on the hypothetical pip upgrade command).

@pfmoore
Member

pfmoore commented Jun 8, 2016

I'm not aware of a strong requirement for the current behaviour (by "strong" I mean "anything other than backward compatibility"). But if people did need it, they can get it by simply listing all of the dependencies on the command line.

It's pretty easy to write a script to show all (recursive) dependencies of a package:

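# reqs.py
# Print the names of an installed package and all of its (recursive)
# dependencies, as found in the current environment.
import sys
from pkg_resources import get_distribution

def reqs(req):
    results = []
    seen = set()  # keys already visited; guards against dependency cycles
    queue = [get_distribution(req)]
    while queue:
        current = queue.pop()
        if current.key in seen:
            continue
        seen.add(current.key)
        results.append(current)
        for r in current.requires():
            # get_distribution accepts a Requirement and returns the
            # installed distribution that satisfies it
            queue.append(get_distribution(r))
    return results

if __name__ == '__main__':
    print('\n'.join(sorted(set(d.project_name for d in reqs(sys.argv[1])))))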

Then you just do pip install $(python reqs.py foo) to get an "eager install" of foo and its dependencies. I'm sure there are shortcomings with this approach, but is the problem common enough to warrant a more complex solution?

@dstufft
Member

dstufft commented Jun 8, 2016

@pfmoore well that script only works if no dependencies have changed between the currently installed versions and the to-be-upgraded-to versions (and of course, it assumes everything is already installed).

That being said, the only real use case I can come up with is installing a project into an environment that already has stuff installed into it, but wanting to have the latest version of dependencies. IOW, a framework like Pyramid might prefer that new users install its dependencies using the recursive upgrade. HOWEVER, even in this scenario (which is the only one I can think of), if the hypothetical Pyramid's version specifiers are all correct, then the end user should expect it to work regardless (and it's similar in nature to what folks would get already in the current pip install behavior with something already installed).

If someone does want "Pyramid, and all of its dependencies up to date", it's somewhat nicer than the proposed way of doing that (combining the two proposals), which would be pip install Pyramid && pip upgrade (which isn't exactly the same, since pip upgrade would do more than just Pyramid).

So my hesitation is that I struggle to come up with a scenario where it's the clear-cut right thing to do, but it could make some edge cases moderately nicer. We could always leave it out and, if we come across people asking for it, add it in again at that point in time.

@pradyunsg
Member Author

pradyunsg commented Jun 8, 2016

I dislike both A and B. I don't like the idea of introducing a new command, nor do I want to switch to the new behavior without some "deprecation" style period for the current behavior. Hence I put forth my own proposal below.

I'm not aware of a strong requirement for the current behaviour

Me neither. Yet, I don't want to break someone's working code without telling them. I would find it rude. 'Don't do unto others what you don't want others to do unto you.' This is why I think I don't want to switch with no warning as in @njsmith's B option either.

If someone does want "Pyramid, and all of its dependencies up to date", it's somewhat nicer than the proposed way of doing that (combining the two proposals), which would be pip install Pyramid && pip upgrade (which isn't exactly the same, since pip upgrade would do more than just Pyramid).

As I understand it, if someone wants "Pyramid, and all of its dependencies up to date" after the switch to the new behavior, it's pip install --upgrade-strategy=eager Pyramid. That would eagerly upgrade Pyramid and its dependencies to the latest version, regardless of whether an upgrade is necessary.

I thought it was clear that we wanted to provide both the current "recursive-latest" and the new default "only-if-needed" upgrades. That just emphasizes that I need to post the commonly accepted ideas.


Proposal

  1. Make a major version release that deprecates current behavior and provides a warning on use of these commands with opt-in flags and configuration to the new behavior.

    • Flag(s) should be provided to allow the user to check out the new behavior to be introduced. Using the flag(s) in this version would imply --upgrade.
    • Maybe pip install --upgrade warns that this flag will become a no-op in the next release.
    • pip install warns that the behavior is changing in the next release and that the current behavior won't be available in the next release.

    Possibly, both warnings provide a link to documentation that suggests to the user what they should do.

  2. Switch to new behavior in next major version release.

    • If someone really needs the current behavior, a --no-upgrade flag may be added. But I don't want to see that unless someone really needs it.

Bikeshed: Options and flags in 1. I prefer to add a --upgrade-strategy=(eager / non-eager / default) as the flag in 1 and switch the default strategy to eager in 2.

@pradyunsg
Member Author

pradyunsg commented Jun 8, 2016

Also worth pointing out, explicitly, there is no need for a dependency resolver in pip for this. While with the new behavior it's still possible to break some line edge in the entire dependency graph, it becomes less likely if you upgrade less often.

how dependencies are handled

Uniformly, independent of depth. The user can choose between eager and non-eager upgrades. They are as I defined them in my earlier write-up.

what happens with constraints (and when they conflict)

I would say whatever happens today.

binary vs source

To be handled in #3785. Until then, keep as is.

@pfmoore
Member

pfmoore commented Jun 8, 2016

I think it was clear that we wanted to provide both the current "recursive-latest" and the new default "only-if-needed" upgrades.

Nope, I don't think so. The "only if needed" behaviour is, as far as I know, agreed by everyone as what we would like to have available. But I understood the current behaviour to be generally considered as having issues. Whether those issues all revolve around the "pip needs a proper dependency resolver" problem, and we're OK with keeping the current behaviour until that is fixed, I don't really know.

@dstufft
Member

dstufft commented Jun 8, 2016

The main problem(s) with the current behavior (that isn't actually a result of the lack of a real resolver) is that the "greedy-ness" of it causes things to be upgraded that might not otherwise be upgraded. On the tin that doesn't seem like a big problem, however it has some subtle (and some not so subtle) interactions:

  • It makes it more likely that something like sudo pip will inadvertently break someone's OS because it makes it more likely we'll recurse into a dependency provided by the OS (even if the user invoking pip had no idea that would be affected).
  • Some libraries are very expensive to build and install, particularly ones like Numpy where compiling can take 30+ minutes.
  • The recursive upgrade introduces more churn on the installed set of packages, which increases the likelihood that something that was already working, breaks because of an upgrade to a shared dependency.

The first two of those are things that could possibly be fixed, at least in part, by other solutions (and for which this solution isn't a total fix either). You could fix the breaking of the OS by making pip smarter about not mucking around with the OS files by default. Wheels make it easier to install even hard-to-build libraries like Numpy, but not everything has a wheel, and if you're on anything that isn't Windows, OS X, or manylinux1 then your chances of getting a wheel are basically zero.

The churn on what is installed is only going to be fixed by this patch, as well as reducing the occurrence of the first two issues (by being more conservative when we actually attempt to do anything).

@dstufft
Member

dstufft commented Jun 8, 2016

Of course, this is a super subtle sort of difference and it's hard to nail down all of the exact benefits (they'd be more accurately described as trade-offs, rather than a straight set of benefits). I don't know if the old behavior is something that, in the cases it's useful, is useful enough that people would bother using a flag for it or not. If we add the flag, it becomes hard to ever remove it; if we don't add it now, we could always add it again in the future. For that reason I lean somewhat towards leaving it out and waiting to see if we get people asking for a way to bring the old behavior back.

@pradyunsg
Member Author

pradyunsg commented Jun 8, 2016

I think it was clear that we wanted to provide both the current "recursive-latest" and the new default "only-if-needed" upgrades.

Nope, I don't think so

Hmm... I did think that both behaviors were seen as useful. That's what the Pyramid example made me think. It's using the current behavior and it does exactly what is desired.

It seems desirable to be able to say "upgrade pkg and all its (sub-)*dependencies to latest version". I don't want to upgrade everything in my ecosystem, I just want to get the latest bug-fixes for pkg and dependencies.

  • Some libraries are very expensive to build and install, particularly ones like Numpy where compiling can take 30+ minutes.

By conservatively upgrading packages, it does make this happen less often.

Edit: You mentioned that.

  • The recursive upgrade introduces more churn on the installed set of packages, which increases the likelihood that something that was already working, breaks because of an upgrade to a shared dependency.

This needs a dependency resolver to be fixed. I consider that out-of-scope of this issue.

If we add the flag, it becomes hard to ever remove it; if we don't add it now, we could always add it again in the future. For that reason I lean somewhat towards leaving it out and waiting to see if we get people asking for a way to bring the old behavior back.

That works pretty well with me. Adds to why I want a "deprecation" release for the current behaviour to get people asking for it to stay, rather than re-added.

Edit: s/version/behaviour/


😕 Any comments on my proposal above?

@dstufft
Member

dstufft commented Jun 8, 2016

  • The recursive upgrade introduces more churn on the installed set of packages, which increases the likelihood that something that was already working, breaks because of an upgrade to a shared dependency.
    This needs a dependency resolver to be fixed. I consider that out-of-scope of this issue.

No, this isn't related to the dependency solver thing. This is just "software is hard, and new versions sometimes add new bugs, therefore, the more churn you have, the more likely you are to get bit by new bugs".

The most stable (in terms of new, not previously encountered bugs) software is software that never changes.

Any comments on my proposal above?

I'm a little concerned about adding a warning for every invocation of pip install, but I'm not opposed to it-- it's certainly the safer route though and it's one that's more in line with our typical deprecation process and it gives a chance for people to clamor for an option to use the old behavior.

I do think that we need to either deprecate the --upgrade flag completely as part of this (probably no-op it and hide it for a long while), or we need to add --no-upgrade to get back to the old behavior of pip install .... I don't want a fairly useless --upgrade flag lying around in our help. So then the question for a --[no-]upgrade flag becomes whether we see the current behavior of pip install as useful at all. Here again I don't have a strong opinion -- we could use the deprecation period again as a chance to see.

@pfmoore
Member

pfmoore commented Jun 8, 2016

Any comments on my proposal above?

Honestly, I really don't like the idea that essentially every invocation of pip install will give a warning for a full major release cycle. That seems guaranteed to just annoy users, and as a result we'll probably get no useful feedback, just a lot of complaints about the process.

My preferences remain with @njsmith's approach - probably the "just go for it" approach, but if necessary the gradual version.

I have to admit that I find it very hard to understand the impact on my day to day usage of these various proposals. There's a lot of theory and edge cases being discussed, which is obscuring the key points. I think that whatever transition process we adopt, someone should work on a clear "press-release" style description of the proposed changes and their impact, which we can publish on distutils-sig before making the changes. That should allow us to gauge reactions from the wider community. (I don't think this needs a PEP, but I do think it needs publicising).

My instinctive feeling is that I'll be (mildly) happy with the new "as little as possible" upgrade behaviour, mildly irritated by the fact that "install" now upgrades without an explicit flag (but hopefully I'll get used to it reasonably quickly) and otherwise mostly indifferent. My main usage will probably remain pip install new_thing to install a new package and a manual "get all the package names, and do pip install <all of them at once>" to simulate "update all". Neither of these will be affected by any of the proposals (except that the new "as little as possible" upgrade strategy will avoid the odd unwanted numpy upgrade attempt that the current behaviour inflicts on me).

For me, the tipping point comes when --prefer-binary and "upgrade all" become available. Those will affect my usage, and it won't really be until then that I'll see any benefits (or issues) with the change to upgrade strategy.

@pradyunsg
Member Author

pradyunsg commented Jun 8, 2016

Honestly, I really don't like the idea that essentially every invocation of pip install will give a warning for a full major release cycle. That seems guaranteed to just annoy users, and as a result we'll probably get no useful feedback, just a lot of complaints about the process.

Indeed. I didn't think about that in a hurry to leave. Oops!

My point is, I really want pip itself to have a major version deprecation run with such a major change to the main command of it. Any form it takes, I'm game.

I think being selective about when we show the warning message is the way forward.

How do you choose? @njsmith suggested only when the behaviour differs. Other than the fact that it's essentially doubling the work done in every install execution, as long as we publicise well (in advance and in detail), I think it's a good idea.


edit

Or maybe not on second thought. It won't be showing the message to everyone like we would want to. I would want to show it to everyone at least once.

How about some configuration file magic, asking the user to set a flag in the configuration file? This is where an --upgrade-strategy=default or similar flag would come in handy.

Any alternate ideas for this?


the tipping point comes when --prefer-binary and "upgrade all" become available. Those will affect my usage, and it won't really be until then that I'll see any benefits (or issues) with the change to upgrade strategy.

True. While this change will fix some issues (unnecessary re-installs) directly, I think it might indirectly help resolve other issues as well.

@FichteFoll

Similarly to @pradyunsg's last idea, iirc git shows (kinda long) messages for when it introduced or is going to introduce a big change that you can disable by setting a configuration via commandline that is mentioned in the message. I've liked that so far.

@dstufft
Member

dstufft commented Jun 8, 2016

A temporary option to disable the message wouldn't be the worst possible behavior.
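
A git-style notice that the user can switch off through configuration might look something like the sketch below. The `upgrade-notice` key, the notice wording, and the function name are hypothetical; none of them is an existing pip option. The sketch reads the configuration from a string to stay self-contained.

```python
import configparser

NOTICE = (
    "NOTICE: the default behaviour of 'install' is changing in the next "
    "major release. Set 'upgrade-notice = off' under [install] in your "
    "configuration file to hide this message."
)


def pending_notice(config_text):
    """Return the notice text, or None if the user has switched it off."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    # fallback=True: show the notice when the section or key is absent.
    show = config.getboolean("install", "upgrade-notice", fallback=True)
    return NOTICE if show else None
```

Every user sees the message until they act on it, and a single configuration line silences it afterwards.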

@njsmith
Member

njsmith commented Jun 9, 2016

@pradyunsg: Before we get into the nitty-gritty of deprecation strategies... is there any chance I can convince you that the "option B" approach is okay? (Normally I wouldn't try, but given that core devs like @dstufft and @pfmoore are okay with it I guess I will try :-).) I definitely understand why you find just switching to be "rude" to users, but it's a complex trade-off -- not switching is also rude in different ways to different people. For example:

  • The longer we delay the switch, the longer we're continuing to inflict the annoying current behavior on our users -- note that #59 (Add "upgrade" and "upgrade-all" commands) has 199 comments from 56 participants, many of them just +1's. Making them wait another year is kinda rude too.
  • Deprecation periods are complicated and difficult -- they intrinsically impose extra costs on users. Pip gets 10 million downloads/month just from PyPI, so e.g. your proposed message will be shown at least 10 million times. Multiply by how long it takes to read something like that, make some decision, update some config file, etc., and then maybe do it over again in a year when the defaults actually switch.
  • If we're ever going to get this swamp drained then at some point we gotta get moving. Waiting a year between each improvement is really painful.
  • And deprecation cycles are costly for developers -- we're already extremely, extremely short on developer resources, so there's a very real cost to spending time implementing complex deprecation logic, keeping track of the schedule, coming back a year later and reminding ourselves what we decided, etc. That's time that could be spent on improving warehouse, implementing a proper resolver, pushing forward --prefer-binary, etc. etc. It's not enough to say "a deprecation is important", one has to argue that it's more important than other things one could do with that time.

8.1.2 flat out broke a bunch of people's deployments due to a complicated bug involving the interaction between pip, pkg_resources, and devpi. It sucked but people dealt with it. Given our limited resources, it's a fact that we're going to sometimes break things and sometimes leave broken things sitting for years without progress and generally cause users pain. We can't change that, but we can at least be smarter about which kinds of pain we cause users, and "install starts working the way lots of users already expect" is a much more productive outcome than most :-).


@pfmoore:

You say pip require foo: same as the current pip install foo. So it'll error if foo is installed?
No, right now if foo is already installed then pip install foo does nothing and exits successfully. I was imagining pip require would be a way to directly talk to the constraint resolver: "here's a new constraint, please ensure it is satisfied". Semantically meaningful and well-defined, but a pretty low-level for-experts interface.

@dstufft: I find pip install --no-upgrade foo rather confusing, though -- from the name I'd expect that it would do something like... try to install foo but error out if foo had a dependency that would force the upgrade of something I already had installed? Which is kinda the opposite of what it would actually do. For me the require operation and the install operation are conceptually really distinct -- see also Guido's comments on how if you ever find yourself writing a function that takes a boolean arg, and you know that your callers will be passing a constant rather than a variable for that arg, then you should have two functions. So splitting it out into a new command was me trying to imagine what it might look like in a world where we added it for its own sake, rather than just to fulfill our obligation to have a --no form of --upgrade or whatever. But I'm also just as happy to drop it entirely for now...


Okay, how about this as a strategy:

  • 9.0 makes pip install foo = pip install --upgrade foo = non-recursive upgrade
  • We make a nice little writeup explaining the actual effect this has (pip install foo now will upgrade if foo is installed; pip install --upgrade foo will no longer upgrade all dependencies recursively)
  • We provide some script like @pfmoore's above and in the release notes say "if you really want a recursive upgrade, try this..."
  • We make a mental note to consider adding a pip require foo command in the future if it turns out to be useful, but defer that for now because it's not really a priority and it's easier to add stuff than to take it away
  • We keep --upgrade around as a no-op indefinitely, but take it out of --help, and the reference manual just says "no-op; kept for backwards compatibility". (Maybe in a few years we tear it out entirely, maybe not -- I don't care and am happy to just defer that discussion until a few years have passed.)

That avoids the worst gratuitous breakage (there's no reason for pip install -U foo to become a hard error and invalidate tons of existing tutorials), but otherwise keeps things radically simple, so we can skip or defer thinking about things like --no-upgrade or the most ideal spelling for recursive upgrades and get the important parts moving ASAP.

@njsmith
Member

njsmith commented Jun 9, 2016

It seems desirable to be able to say "upgrade pkg and all its (sub-)*dependencies to latest version". I don't want to upgrade everything in my ecosystem, I just want to get the latest bug-fixes for pkg and dependencies.

The problem with this is that in lots of cases, it doesn't really make sense to assign some dependency to any particular dependant. Like, lots of people have environments with ~30 different packages installed, of which 1 is numpy and 29 are packages that depend on numpy. So if I want the new bug-fixes for astropy, should that upgrade my numpy? That might fix some issues with astropy but it might also break the other 28 packages, who knows. Pyramid's dependency chain includes a number of widely-used utility libraries like zope.interface and repoze.lru and setuptools (why? idk). So recursively upgrading Pyramid might break Twisted (which depends on zope.interface and setuptools and nothing else). There's no way that "I want the latest bug-fixes for Pyramid" implies "I want the latest setuptools" in most users' minds -- but that's how pip install -U currently interprets it.
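
The shared-dependency problem can be illustrated with a small sketch. The graph below is a toy simplification of the Pyramid/Twisted example (it treats every dependency mentioned as a direct one and ignores everything else), and the helper names are made up for illustration.

```python
# Toy dependency graph, simplified from the example above.
DEPENDS_ON = {
    "pyramid": {"zope.interface", "repoze.lru", "setuptools"},
    "twisted": {"zope.interface", "setuptools"},
    "zope.interface": set(),
    "repoze.lru": set(),
    "setuptools": set(),
}


def closure(package):
    """Return `package` plus all of its transitive dependencies."""
    seen = set()
    stack = [package]
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(DEPENDS_ON.get(name, ()))
    return seen


def bystanders(target):
    """Packages outside `target`'s closure that share a dependency with it.

    These are the installed packages that a recursive upgrade of `target`
    could affect even though the user never named them. The result maps
    each bystander to the sorted list of packages it shares.
    """
    touched = closure(target)
    return {
        name: sorted(closure(name) & touched)
        for name in DEPENDS_ON
        if name not in touched and closure(name) & touched
    }
```

In this toy graph, `bystanders("pyramid")` reports that Twisted shares `setuptools` and `zope.interface` with Pyramid, which is the collateral effect described above.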

@pradyunsg
Member Author

Similarly to @pradyunsg's last idea, iirc git shows (kinda long) messages for when it introduced or is going to introduce a big change that you can disable by setting a configuration via commandline that is mentioned in the message.

That's exactly where I got the idea.

I've liked that so far.

Ditto. Hence I would like to see it in pip. It's a field-tested process.

I do agree that every-run-warning is a bit too much but having it show all the time until the user acts on it is something I know, from git, works even for major changes like this.

is there any chance I can convince you that the "option B" approach is okay?

Maybe. You're right that the trade-offs are complicated, and having to wait a year till the switch isn't the most convenient thing either. Breaking certain niche cases that don't affect everyone is fine. That is just going to happen. Here, we're changing the most used command of pip (in documentation of packages and otherwise). Doing so without a proper warning period might just not be the best of things to do. Nor should this be done without giving people some time to fix their tools/workflow/etc. to work with the new behaviour.

With @njsmith's current proposal, I still don't get a proper warning or give people some preview of the upcoming (major) change. That's all, but it's enough that I don't like the proposal. If someone can convince me that dropping these two requirements would be fine and it's possible to properly inform people that this, a big change, is coming their way in some other manner, I'm fine with that.

If we get the deprecation nitty-gritties right, it should be possible to implement this in such a manner that the deprecation-release-only stuff stays in one module (module as in English; a class, function or something else) and the next major release just stops invoking that module and removes it. That way at least the post-deprecation work is minimized.

#59 has 199 comments from 56 participants, many of them just +1's. Making them wait another year is kinda rude too.

They don't have to wait another year. They can just opt-in to the new behaviour. We're just giving time to people whose stuff broke due to the change. Others can just opt-in to the nicer behaviour.

We keep --upgrade around as a no-op indefinitely, but take it out of --help, and the reference manual just says "no-op; kept for backwards compatibility". (Maybe in a few years we tear it out entirely, maybe not -- I don't care and am happy to just defer that discussion until a few years have passed.)
[snip?]
That avoids the worst gratuitous breakage (there's no reason for pip install -U foo to become a hard error and invalidate tons of existing tutorials)

If it wasn't obvious, this would happen in my proposal's 1. No one gets bothered by a no-op -U's presence. Its absence will invalidate many packages' documentation and break stuff. We'll keep it till it is rare enough to be safe to remove. That discussion should happen a few years later. (let's mark 16th September 2018 for this, for no reason whatsoever)

Regardless of whether I change my position on @njsmith's proposal, we'll keep a no-op --upgrade post-deprecation.


There's no way that "I want the latest bug-fixes for Pyramid" implies "I want the latest setuptools" in most users' minds -- but that's how pip install -U currently interprets it.

True. But this is due to the lack of a dependency resolver. Once it's added, it does exactly what the user wanted. There's only so much we can do till then. Adding a warning in the documentation about the potential breakage of the dependencies is sufficient for now IMO, since this behaviour shall become opt-in. And this assumes that the packages maintain their promises made through version-numbers. If they break, there's little pip can do until packages refine their version-specifiers.

As a side, I think there should be a piece of documentation mentioning that pip may break your dependency graphs.

So if I want the new bug-fixes for astropy, should that upgrade my numpy?

Not if it breaks your dependency graph. Neither if it removes your well-configured numpy. The former case needs a dependency resolver. The latter needs "holding back" of upgrades. Both out-of-scope in this discussion.

Until we get those, the most we can do is tell people - "pip doesn't do the right thing all the time and we don't have the resources to fix it. Help would be appreciated."


This is just "software is hard, and new versions sometimes add new bugs, therefore, the more churn you have, the more likely you are to get bit by new bugs".

I can only say, sad but true to this.

@pradyunsg
Member Author

pradyunsg commented Jun 9, 2016

I am posting the mental picture of the post-deprecation behaviour that is in my head... Just to make sure I don't miss out on anyone's concerns.

  • pip install upgrades in a non-eager manner, upgrading dependencies only-if-needed.
    • TBD: whether we also want to add a no-op flag, which depends on the deprecation path
  • pip install --some-flag upgrades in an eager manner, upgrading dependencies to the latest version allowed by version-specifiers.
    • TBD: if wanted
  • --upgrade becomes a no-op. It is kept in install --help, documented as "kept for backwards compatibility".
    • TBD if it is removed from help, I say no
  • pip require is deferred until someone comes around asking for it. As noted below, this cannot be the case. (edit: it later turned out that I was wrong.. :| )
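To make the difference between the non-eager and eager modes in the list above concrete, here is a minimal sketch using a toy in-memory index. Everything in it is a hypothetical simplification for illustration (the names `INDEX` and `plan_upgrade`, the "latest" versions, and requirements modelled as bare minimum versions); it is not pip's implementation.

```python
# Toy model, not pip's code. Versions are integer tuples and every
# requirement is just a minimum version. The "latest" numbers are made up.
INDEX = {
    "pyramid": {
        "latest": (1, 7),
        "requires": {"setuptools": (0,), "zope.interface": (3, 8, 0)},
    },
    "setuptools": {"latest": (24, 0), "requires": {}},
    "zope.interface": {"latest": (4, 2, 0), "requires": {}},
}


def plan_upgrade(name, installed, eager):
    """Return {package: new_version} for the changes an upgrade of `name` makes."""
    plan = {}

    def visit(pkg, minimum, is_root):
        current = installed.get(pkg)
        latest = INDEX[pkg]["latest"]
        # The named package always moves to the latest release. A dependency
        # moves only if we are eager, or if it is missing or too old.
        if is_root or eager or current is None or current < minimum:
            wanted = latest
        else:
            wanted = current
        if wanted != current:
            plan[pkg] = wanted
        for dep, dep_minimum in INDEX[pkg]["requires"].items():
            visit(dep, dep_minimum, is_root=False)

    visit(name, (0,), is_root=True)
    return plan


installed = {"pyramid": (1, 6), "setuptools": (20, 0), "zope.interface": (4, 1, 0)}
print("non-eager:", plan_upgrade("pyramid", installed, eager=False))
print("eager:    ", plan_upgrade("pyramid", installed, eager=True))
```

In this toy model the non-eager plan touches only the named package, while the eager plan also bumps every dependency that has a newer release.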

Once we have decided upon the required behaviour, I'll start working on the implementation. (I'm still familiarizing myself with the implementation details of pip install and #3194 right now.)

Let's finalize the behaviour and how we want to do the deprecation here and we'll bikeshed the option names in the PR I eventually make.


pip install --target <dir> is documented as "By default this will not replace existing files/folders in <dir>."

Since install shall now start upgrading (replacing) by default, it seems more consistent to replace the existing files and folders by default and provide some flag if the user wishes to have the older behaviour of not-replacing. AFAIK, this flag is undecided on. pip require has similarities. So, I think we can't defer the discussion on pip require and need to do it now.

The overlap with pip install and the need for it presented by install --target makes me want to have the require behaviour behind a flag in install.

@njsmith
Member

njsmith commented Jun 9, 2016

@pradyunsg:

Here, we're changing the most used command of pip (in documentation of packages and otherwise). Doing so without a proper warning period might just not be the best of things to do. Nor should this be done without giving people some time to fix their tools/workflow/etc to work with the new behaviour.

It's the most used command of pip, but we're only touching two weird corner cases: pip install foo where foo is already installed, and pip install -U foo where foo has some recursive dependency that's out of date. While I'm sure there will be some obscure breakage no matter what we do, I can't think of any sensible tools or workflows that would be broken by this -- can you give an example of what you're thinking of?

True. But this is due to the lack of a dependency resolver. Once it's added, it does exactly what the user wanted.

??? no idea what you mean here -- Pyramid recursively depends on setuptools, and my argument is that this demonstrates that "package and its recursive dependencies" doesn't actually correspond to any meaningful concept in the user's mental model. AFAICT this is totally orthogonal to the dependency resolver issue?

pip install --target <dir> ... Since install shall now start upgrading (replacing) by default, it seems more consistent to replace the existing files and folders by default

I think the issue with pip install --target <dir> is that it doesn't really install into an environment at all -- it's used for things like vendoring. And without an environment, the upgrade/install distinction doesn't even make sense. My vote is that we leave it alone -- the current behavior is fine IMO.

pip require has similarities.

It does?

@pradyunsg
Member Author

we're only touching two weird corner cases: pip install foo where foo is already installed, and pip install -U foo where foo has some recursive dependency that's out of date.

Hmm... Indeed. While the change is major, I do agree that it's just weird corner cases that we break. But I would really want to get some user input before making the change... It doesn't feel right to make such a change without a deprecation.

If everyone else here (mainly @pfmoore and @dstufft) says that they prefer no-deprecation switch over a deprecation switch, I guess I'll be fine with going ahead and implementing @njsmith's proposal.

True. But this is due to the lack of a dependency resolver. Once it's added, it does exactly what the user wanted.

Pyramid recursively depends on setuptools, and my argument is that this demonstrates that "package and its recursive dependencies" doesn't actually correspond to any meaningful concept in the user's mental model.

I disagree. It is a meaningful thing to want to get the latest possible version of a package and its dependencies. As an example, if I have found that my current environment has an issue related to pkgA, I would want to check against the latest releases of it and all its dependencies to eliminate the possibility of this being an issue that got fixed in a new release. I think it's reasonable to expect that to be possible.

Just to be clear, let's not provide the old behavior for the simple reason that it provides lazy people a way to keep the existing behavior if it works for them. We'll keep it only if we figure out some valid use-case. If we go down the deprecation path, it'll be deprecated but available till end-of-deprecation. If someone wants that behavior, they'll say they do and we'll pull it out of deprecation and let it stay.

AFAICT this is totally orthogonal to the dependency resolver issue?

The dependency resolver comes into play when A and B both depend on C, A is recursively upgraded, breaking C for B since pip does not care about B's version specifiers when handling A's. This was the example you gave with Pyramid, Twisted and zope.interface being A, B and C respectively.
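The A/B/C situation can be sketched in a few lines. This is a toy model under stated assumptions: the version numbers and the specifier ranges are invented, and `REQUIRES`, `satisfies` and `violated` are hypothetical names, not anything from pip.

```python
# Toy model, not pip's code. A specifier is (min_inclusive, max_exclusive);
# max_exclusive=None means "no upper bound". All numbers are invented.
REQUIRES = {
    "A": {"C": ((1, 0), None)},    # A needs C >= 1.0
    "B": {"C": ((1, 0), (2, 0))},  # B needs C >= 1.0, < 2.0
    "C": {},
}


def satisfies(version, spec):
    low, high = spec
    return version >= low and (high is None or version < high)


def violated(installed):
    """List (package, dependency) pairs whose declared specifier is not met."""
    return [
        (pkg, dep)
        for pkg in sorted(installed)
        for dep, spec in REQUIRES[pkg].items()
        if not satisfies(installed[dep], spec)
    ]


before = {"A": (1, 0), "B": (1, 0), "C": (1, 5)}

# Recursive upgrade of A: A and its dependency C both jump to their latest
# release, and B's specifier on C is never consulted.
after_recursive = {**before, "A": (2, 0), "C": (2, 0)}

# Only-if-needed upgrade of A: C 1.5 already satisfies A, so it stays put.
after_only_if_needed = {**before, "A": (2, 0)}

print(violated(before))
print(violated(after_recursive))
print(violated(after_only_if_needed))
```

Only the recursive case leaves B with an unmet declared requirement. A resolver would have to consider B's specifier before touching C.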

pip require has similarities.

It does?

Yes, in that it also does not affect already-installed packages. But on reviewing this, they are more different than similar. This option is more along the lines of --avoid-installed. I don't know why I thought they were similar enough to merge...

@pfmoore
Member

pfmoore commented Jun 9, 2016

@njsmith

No, right now if foo is already installed then pip install foo does nothing and exits successfully

What I see is

>pip install xlrd
Requirement already satisfied (use --upgrade to upgrade): xlrd in c:\users\uk03306\appdata\local\programs\python\python35\lib\site-packages

I'm not sure about the exit status, I was thinking about the user experience. Apologies, I was being sloppy in my wording - I meant that I "get an error message" (maybe it's technically a warning) rather than that pip sets the exit code to error. But either way it's a minor point.

Responding to other emails:

I agree with @njsmith that deprecation is in many ways just as bad an experience for users as a sudden change. In this case I remain in favour of just going straight to the improved version. There's been plenty of debate on the tracker, and lots of people have noted their interest in seeing the new approach land. @pradyunsg if you still feel that we should warn users, then by all means post on distutils-sig (and even python-list if you feel it's warranted) and announce the plan there. There's a risk that doing so results in even more bikeshedding and debate, which may or may not be productive, but that's the nature of packaging changes :-)

I'm also in agreement that I don't see "Pyramid and all its dependencies" as a particularly useful thing to want to upgrade. Pyramid itself, of course. And Pyramid and selected dependencies, quite possibly. And certainly "everything in this virtualenv (which was set up for my Pyramid development)".

Which prompts the thought - how often would people asking for eager upgrades be better served by using virtualenvs and upgrade-all? I can't speak for other people's workflows, but it's certainly how I tend to operate. And of course for many environments, pip freeze and exact version restrictions are the norm, so eager updates would be inappropriate there.

Finally, we've decoupled "pip needs a solver" from this proposal - so arguing that eager is useful once we have a solver isn't relevant right now. Current eager behaviour can break dependencies - so we should remove it, and then maybe reintroduce a working version once we have a solver and we've had feedback that (a not-broken version of) the feature is useful to people.

@pradyunsg
Member Author

pradyunsg commented Jun 9, 2016

if you still feel that we should warn users, then by all means post on distutils-sig (and even python-list if you feel it's warranted) and announce the plan there.

I think announcing on distutils-sig sounds fine to me. python-list, I'll think about it.

There's a risk that doing so results in even more bikeshedding and debate, which may or may not be productive, but that's the nature of packaging changes :-)

That's a trade-off. I guess I'll redirect them to the PR for the bikeshedding and take other comments on the mailing list...

Quick correction: I really should have mentioned the entire help-text of --target.

""" Install packages into <dir>. By default this will not replace existing files/folders in <dir>. Use --upgrade to replace existing packages in <dir> with new versions. """

If we are making --upgrade a no-op, --target should not depend on it. We need to figure this out.

Finally, we've decoupled "pip needs a solver" from this proposal - so arguing that eager is useful once we have a solver isn't relevant right now. Current eager behaviour can break dependencies - so we should remove it, and then maybe reintroduce a working version once we have a solver and we've had feedback that (a not-broken version of) the feature is useful to people.

Sounds good to me. I guess we can drop the eager upgrade behavior. It's easy to add it if we need to. Removing it (after the switch), not so much. I do think not providing it and advocating use of virtualenv for the job is a good idea.

@pradyunsg
Member Author

pradyunsg commented Jun 9, 2016

@pfmoore I take it that you wish to go down the no-deprecation path.

I'm also in agreement that I don't see "Pyramid and all its dependencies" as a particularly useful thing to want to upgrade. Pyramid itself, of course. And Pyramid and selected dependencies, quite possibly.

When you put it that way, it makes sense why what I was saying is not ideal.

Current eager behaviour can break dependencies

I think any package change has the potential to. The non-eager behavior just reduces the number of changes and thus works around this issue well enough to reduce breakages substantially.

Anyway, I take it that it's decided that eager upgrades would be dropped.

We need to figure this out.

Maybe reuse --force-reinstall? I don't know enough about these options to be sure...


@dstufft I'm waiting for your views on deprecation vs no-deprecation.

@pradyunsg
Member Author

pradyunsg commented Jun 9, 2016

So, that leaves us with --upgrade and --target only. (and @dstufft's vote)

I request anyone with any issues/requirements, that they feel haven't been handled, to bring them up now. Not that it's the last chance or anything, just a good time to do so.

@pfmoore
Member

pfmoore commented Jun 9, 2016

Current eager behaviour can break dependencies
I think any package change has the potential to.

Specifically current eager behaviour can leave the system in a state where declared dependency requirements (which aren't inconsistent, or otherwise broken) are violated when they were not previously. That is not acceptable, and is what a "proper solver" should address. For the simpler "only as needed" upgrades, my understanding is that the risk of such breakage is minimised even without a solver.

So, that leaves us with --upgrade and --target only.

Apart from changing the help text of --target to not refer to --upgrade, I consider --target to be out of scope here. The help text is

Install packages into <dir>. By default this will not replace existing files/folders in <dir>. Use --upgrade to replace existing packages in <dir> with new versions.

I propose we just replace it with

Install packages into <dir>.

Presumably the default will change (as with normal "install") to overwrite by default, and if you don't want to overwrite, you just don't run the install command (same as if you're installing into site-packages). If users want anything more complex, they can work out the appropriate commands, let's not worry about trying to offer suggestions (that may or may not be helpful in practice).

@xavfernandez
Member

I think whatever the chosen solution is, we'll have to provide an option to enable the old behavior to smooth the transition.
If we go with an upgrading pip install foo we'll need a --no-upgrade.
If we remove the recursive behavior of upgrade we'll need a --recursive.

@dstufft
Member

dstufft commented Jul 21, 2016

I agree with a --no-upgrade flag for sure (and keep the --upgrade flag to enable it to be turned back to upgrade if someone has disabled it). I'm not sure about the --recursive flag long term, but I'm not dead set against it.

@dholth
Member

dholth commented Jul 21, 2016

I think I would experience major breakage by this change, but principally in non-interactive pip invocations, while 'pip install -U' is something that would typically happen interactively, and crucially when someone is doing development work and is available to deal with the consequences. That's why I jokingly suggested we could check isatty() to choose between one behavior or the other. But is there a way to measure either amount of time or is it just a circular volley of opinions? My opinion is that as an experienced person I do have to re-type install with -U, but it is quick, while fixing a virtualenv when I was least expecting it is hundreds of times slower.

Another solution that has already been discussed is to give the new behavior a new name (an easier name to remember than pip install -U?), and educate people on the new best-er practice; if the n00bs who in theory have the most trouble are reading the new documentation and using the new name, problem solved.

While we're on the subject, where is the 'pip rollback' command? Before and after each invocation pip should store the versions and perhaps the wheels of every installed package in a log along with a timestamp. Then if there is a problem you can just go backwards, no fuss.
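As a rough illustration of the logging half of that idea, the sketch below records installed versions with a timestamp and diffs two records. It uses the standard library's `importlib.metadata`; the function names `snapshot` and `changes` are hypothetical, there is no such feature in pip, and storing wheels or actually rolling back is left out entirely.

```python
import time
from importlib import metadata


def snapshot():
    """Record the name and version of every installed distribution."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            versions[name] = dist.version
    return {"timestamp": time.time(), "versions": versions}


def changes(before, after):
    """Return {name: (old, new)} for each package that differs; None means absent."""
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


# Usage idea: take a snapshot before and after an install, then diff them.
# The dictionaries below are invented example data.
old = {"foo": "1.0", "bar": "1.0"}
new = {"foo": "2.0", "bar": "1.0", "baz": "0.1"}
print(changes(old, new))
```

The diff tells you which versions to reinstall to get back to the earlier state; how to do that safely is a separate question.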

Yes, I'm also aware that some set of current best practices, which are more work, could also solve some of these same problems, but one person's best practices are just another person's unnecessary extra work.

@dstufft
Member

dstufft commented Jul 21, 2016

The fact that the breakage will be principally in non-interactive pip invocations is actually a good point, but I think it argues more for making the change. The primary place where the current behavior makes sense is when scripting with pip, and when you're scripting, adding an extra flag to the command is no great burden; however, when you're running pip interactively the default option should be the option that you're most likely going to want.

If you want a rollback command I suggest another issue for it.

@njsmith
Member

njsmith commented Jul 21, 2016

Just in terms of UI bikeshedding, I still like the idea of the idempotent/scripting-oriented behavior getting a new verb, like pip require numpy -- to me that does a good job of capturing the conceptual difference (while pip's thicket of flags is super-confusing and their interaction hard to predict), and when scripting IME it's easier to remember to use the verb that means what you want than it is to remember to consistently pass some extra flag every time.

But I think the verb that we teach users first (which is install) should be the verb whose defaults are oriented towards new user needs, meaning interactive use, interpreting pip install django as meaning pip install django==$LATEST, etc.

@pfmoore
Member

pfmoore commented Jul 22, 2016

But is there a way to measure either amount of time or is it just a circular volley of opinions?

This is precisely my point. I don't think there are any compelling (as in, likely to convince the other camp) arguments for either side. And in that case, the status quo wins. My biggest concern here is that we don't (as a project) have a good means of arbitrating this type of situation, and we end up with this hovering over us forever, because there's always the possibility that someone could commit a PR, simply because those who objected the previous time didn't notice a discussion being reopened. What we need is some sort of equivalent of Python's rejected PEPs, which would allow us to say "we've decided (for the following reasons) to do nothing" and then be able to shortcut the process of someone asking to revisit the decision and having to go through all the old fruitless arguments again.

I'd rather find a way to make the current non-default behaviour more easily accessible for people that need it, than waste time in arguments that will simply result in both sides becoming more and more entrenched in their positions. Although I don't really know how to do that - I really don't understand what's so confusing about "install installs, install --upgrade installs but also upgrades if needed".

But I think the verb that we teach users first (which is install) should be the verb whose defaults are oriented towards new user needs, meaning interactive use, interpreting pip install django as meaning pip install django==$LATEST, etc.

Well, while I see your point that we should view interactive use (by new users) as the prime use case, I'm not convinced that implies "install or upgrade". I'd argue that the failure mode of an implied upgrade (you upgrade an existing install without meaning to, and break another part of your system by doing so) is sufficiently bad (even for an experienced user) that it warrants a flag to say "I understand the consequences".

Project instructions saying "use pip install FOO and you're good to go" can (and should) be changed. We shouldn't be driving a decision like this based on other people's erroneous documentation, no matter how much of it there is. The wording should just be "If you don't already have FOO, use pip install FOO and you're good to go. If you have FOO already but want the new version, use pip install --upgrade FOO".

I know there's anecdotal evidence of people spending lots of time trying to work out what went wrong because they didn't include --upgrade. But how are we supposed to evaluate that, given that it's (by definition) impossible to get evidence of how many people have not had any issue with the current behaviour? Make a change and wait for bug reports from people saying "I did pip install foo and it upgraded my existing foo, which broke bar - how do I unpick this mess?" Personally, I don't want to have to support people in that situation...

@dholth
Member

dholth commented Jul 22, 2016

Mercurial measures by getting usage stats from Facebook, they have a special corporate plugin to record them.

@dstufft
Member

dstufft commented Jul 22, 2016

I really don't understand what's so confusing about "install installs, install --upgrade installs but also upgrades if needed".

It's not really that it's confusing at a high level, but that the default behavior requires you to know what's already installed on your system in order to figure out what the outcome of the command is going to be. It's easy to not realize, particularly with virtual environments, what exactly you have installed and assume that you don't have something installed (and then get confused when you're not getting the version you expect).

I have, at any one time, something like 50-100 different virtual environments on my personal computer, one for each project I work on. It's basically impossible for me to know what's installed into a particular environment without sitting there and hitting pip list and then going through the entire list which takes way more time than I'm ever going to do.

We shouldn't be driving a decision like this based on other people's erroneous documentation, no matter how much of it there is.

I don't think this is entirely true. Neither option is objectively correct so we're left trying to figure out a subjective answer of what is better, and looking at what mistakes other people made in their documentation is not a bad source of information. To use an extreme example, if literally everyone was doing it the wrong way, then that would suggest that the wrong way is too obvious and the right way isn't obvious enough.

But how are we supposed to evaluate that, given that it's (by definition) impossible to get evidence of how many people have not had any issue with the current behaviour?

Metrics in OSS is a problem :( At some point it'd be great if we can get some so we can see things like "this person ran install and then nothing else" compared to "this person ran install, then almost immediately re-ran it with --upgrade". Unfortunately that's still in the "gee it'd be nice" phase and not anywhere near being done so we're left with throwing chicken bones and trying to divine reality from imagination.

@pfmoore
Member

pfmoore commented Jul 22, 2016

the default behavior requires you to know what's already installed on your system

While I appreciate that this might be an issue, I'm not really convinced it's that major of a problem. After all, if you do pip install and the package is already present, you get immediate feedback:

(x) C:\Work\Scratch>pip install wheel
Requirement already satisfied (use --upgrade to upgrade): wheel in c:\work\scratch\x\lib\site-packages

So it's not like it's going to take you forever to find out that you need to upgrade, or how to do so.

I'd rather have a safe default with the system able to detect that you may have meant the alternative (plus a clear message telling you what to do) over a default with the potential to break unrelated stuff, and no recovery mechanism.

The more we have this discussion the less I understand the advantage of upgrade as default.

@dstufft
Member

dstufft commented Jul 22, 2016

After all, if you do pip install and the package is already present, you get immediate feedback.

Sort of, though it's easy for that to get drowned out in all of the other output with even a moderate amount of packages being installed:

$ pip install Pyramid
Collecting Pyramid
  Using cached pyramid-1.7-py2.py3-none-any.whl
Collecting WebOb>=1.3.1 (from Pyramid)
  Using cached WebOb-1.6.1-py2.py3-none-any.whl
Collecting translationstring>=0.4 (from Pyramid)
  Using cached translationstring-1.3-py2.py3-none-any.whl
Collecting zope.deprecation>=3.5.0 (from Pyramid)
Collecting venusian>=1.0a3 (from Pyramid)
Requirement already satisfied (use --upgrade to upgrade): setuptools in ./lib/python3.5/site-packages (from Pyramid)
Collecting PasteDeploy>=1.5.0 (from Pyramid)
  Using cached PasteDeploy-1.5.2-py2.py3-none-any.whl
Collecting repoze.lru>=0.4 (from Pyramid)
Requirement already satisfied (use --upgrade to upgrade): zope.interface>=3.8.0 in ./lib/python3.5/site-packages (from Pyramid)
Installing collected packages: WebOb, translationstring, zope.deprecation, venusian, PasteDeploy, repoze.lru, Pyramid
Successfully installed PasteDeploy-1.5.2 Pyramid-1.7 WebOb-1.6.1 repoze.lru-0.6 translationstring-1.3 venusian-1.0 zope.deprecation-4.1.2

I'd rather have a safe default with the system able to detect that you may have meant the alternative (plus a clear message telling you what to do) over a default with the potential to break unrelated stuff, and no recovery mechanism.

See, I don't think upgrade-by-default is unsafe at all (when you take into account the other change to upgrade). I find more software that doesn't work with whatever older version of something I had installed than I do software that doesn't work with a newer version. I already know what versions might be getting installed, because I named them explicitly on the command line, so we're not upgrading things that I didn't explicitly call out.

@pfmoore
Member

pfmoore commented Jul 22, 2016

Sort of, though it's easy for that to get drowned out in all of the other output with even a moderate amount of packages being installed:

Good point, maybe it should be highlighted (we use colours for things like warnings, this seems like a good candidate).

See, I don't think upgrade-by-default is unsafe at all

Well, suppose you have foo 1.0 installed, and bar 1.0 that depends on foo. Suppose bar works with foo 1.0 but not foo 2.0 (but the dependency is just on "foo", not "foo < 2.0" because foo 2.0 wasn't out when bar 1.0 was released, and how was the author to know?) Now if I do pip install --upgrade foo, bar breaks. And I may not even find out that bar is broken for a long time, if it's not something I use a lot. That's not a failure mode I want to have to deal with as the default behaviour - even if it's rare, and even if it's arguably bar's fault for not being more strict in its dependencies.

I don't want to turn this into an exercise in "my failure scenario is worse than yours", as that tends to make a debate way too heated (see a typical security discussion) but I do think that "not unsafe at all" is wrong - at best it's "unlikely to cause an issue".

Of course, you mention "the other change to upgrade" here. There's way too many combinations of things being proposed and becoming dependencies of one another (upgrade, upgrade all, rollback, recursive upgrade, non-eager upgrading, ...). Maybe we should take things one step at a time - why not leave this discussion for now, and focus on getting "safe upgrade" in place. Once we have pip install --upgrade in a place where we can guarantee it won't ever break someone's system, maybe we can reopen the debate on the default behaviour then?

@pradyunsg
Member Author

I don't want to turn this into an exercise in "my failure scenario is worse than yours"

Exactly! Let's not.

There's way too many combinations of things being proposed and becoming dependencies of one another (upgrade, upgrade all, rollback, recursive upgrade, non-eager upgrading, ...). Maybe we should take things one step at a time - why not leave this discussion for now, and focus on getting "safe upgrade" in place.

The "other change" is the switch to non-eager upgrades. I feel, this issue deals only with change in behaviour of install and install --upgrade and thus, it should involve discussion on upgrade strategies. Everything else (upgrade-all, rollback), we explicitly decoupled when we opened this issue.

Once we have pip install --upgrade in a place where we can guarantee it won't ever break someone's system, maybe we can reopen the debate on the default behaviour then?

This would mean resolving #988 first which has been stuck for a fairly long amount of time.


I feel we've been bikeshedding and speculating what the user would do for too long. I feel it's no longer reasonable to do that without some metrics which are hard to get reliably. I saw this change as a quick-fix that provided a good middle ground until #988 landed. It's definitely not been quick and it's been debatable if it's a good middle ground. I think it might be worth it to take a step back.


It's already possible to do non-eager upgrades if you want but to figure that out it takes a google search, which is more difficult than it should be. Even if pip provides an option on install to do non-eager upgrades, it'll be better than the status quo.

Also, I don't think anyone wants the current "eager upgrade" default to be the default. If that's not the case, I must have missed it. So, why not switch to non-eager upgrades by default? As long as we switch the default upgrade strategy to be non-eager and maybe provide a way to do eager upgrades, we'll be better off than status-quo.

So, assuming that no one is opposed to these two points, a minimum disruption change would be:

  • pip install --upgrade provides non-eager upgrades by default.
  • Add a --upgrade-strategy=[eager/non-eager] (with any spelling) to choose your upgrade strategy, iff you really want to provide eager upgrades.

How does this sound?
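As a sketch of how the two bullet points above might look on the command line, here is a small `argparse` model. It is only an illustration: the parser, the program name `pip-install-sketch` and the `eager`/`non-eager` spellings are assumptions taken from the proposal text, not pip's actual option handling.

```python
import argparse


def build_parser():
    """Hypothetical command line for the proposal; not pip's real parser."""
    parser = argparse.ArgumentParser(prog="pip-install-sketch")
    parser.add_argument("requirements", nargs="+")
    parser.add_argument("-U", "--upgrade", action="store_true")
    parser.add_argument(
        "--upgrade-strategy",
        choices=["eager", "non-eager"],
        default="non-eager",  # the proposed new default
        help="how dependencies are treated when --upgrade is given",
    )
    return parser


parser = build_parser()

# Plain --upgrade gets the non-eager strategy.
default_args = parser.parse_args(["--upgrade", "pyramid"])
print(default_args.upgrade, default_args.upgrade_strategy)

# Eager upgrades remain available, but only on request.
eager_args = parser.parse_args(["--upgrade", "--upgrade-strategy", "eager", "pyramid"])
print(eager_args.upgrade, eager_args.upgrade_strategy)
```

Keeping the strategy behind a single option with a fixed set of choices avoids adding a separate boolean flag per behaviour.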

@pradyunsg
Member Author

pradyunsg commented Jul 23, 2016

Adding to what I just commented, this is what I feel is the path of least resistance to getting the non-recursive default behaviour in, which is something I would like to see happen.

I feel, making install upgrade by default is essentially a separate discussion. It is something worth discussing but I feel that it shouldn't hold up the change in upgrade-strategy.

(I feel like this would mean a new issue for discussing what I just proposed but I'll take the first-opinions here before doing that)

@pfmoore
Member

pfmoore commented Jul 23, 2016

Just to clarify - I don't believe that non-eager upgrades fix the issue that "pip install --upgrade foo" could upgrade foo from 1.0 to 2.0, but an already-installed bar might declare a dependency on foo (with no version) but not work with 2.0? I can't see that it could (or indeed that it should) and yet that's the scenario that bothers me about making upgrade the default.

Which isn't to imply that I have a problem with your proposal to get non-eager upgrades in place as the first step (I'm +1 on that regardless).

@njsmith
Member

njsmith commented Jul 24, 2016

I'd argue that the failure mode of an implied upgrade (you upgrade an existing install without meaning to, and break another part of your system by doing so) is sufficiently bad (even for an experienced user) that it warrants a flag to say "I understand the consequences".

This is literally the failure mode of every single thing that pip does. It's also the failure mode of not running pip (e.g. the publication of a security exploit that targets your current stack will cause it to go from working -> broken without you changing your environment at all). People who run pip are explicitly requesting that whatever change they have specified be made to their environment, with all the risks and benefits that entails.

No-one runs pip install foo when they already know that foo is installed, because that would be silly. So users already have to be prepared for this to break their environment, because installing new packages (and pulling in their arbitrary transitive dependencies) is just as dangerous as upgrading existing packages. In fact, upgrading foo and installing foo are exactly as dangerous, because they do exactly the same thing -- they pull in the exact same versions of the exact same packages.

The argument that a plain pip install numpy should be interpreted as pip install numpy==$LATEST is that this is much simpler and more predictable than the current thing (where pip install numpy is interpreted as pip install numpy==$CURRENTLY_INSTALLED_VERSION unless there is no currently installed version, in which case it's interpreted as pip install numpy==$LATEST -- just look how much longer that took to write). It reduces the state space that the user has to keep track of -- I can't see how reducing the possible outcomes of a command to a strict subset of what they used to be makes it more dangerous :-).
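
The two interpretations can be written down as a rough sketch. The function names and the version strings below are hypothetical, chosen only for illustration, and the sketch ignores dependencies entirely.

```python
from typing import Optional


def target_version_current(installed: Optional[str], latest: str) -> str:
    """Current behaviour as described above: keep whatever is installed,
    and fall back to the latest version only if nothing is installed."""
    return installed if installed is not None else latest


def target_version_proposed(installed: Optional[str], latest: str) -> str:
    """Proposed behaviour: always the latest version, whatever is installed."""
    return latest


# Hypothetical environments: one with an older numpy, one without numpy.
latest = "1.11.1"
states = ["1.10.4", None]

current_outcomes = {target_version_current(s, latest) for s in states}
proposed_outcomes = {target_version_proposed(s, latest) for s in states}

print(sorted(current_outcomes))
print(sorted(proposed_outcomes))
```

Under these assumptions the proposed rule has a single possible outcome, and that outcome is also one of the outcomes of the current rule, which is the "strict subset" point made above.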

It also has the important benefit that it actually reduces the proliferation of paths through the pip internals -- having separate options for every little thing, no matter how use(ful/less), has a very substantial cost for maintainers, and is how pip became the "Rube Goldberg machine of sadness" described in the #pypa-dev topic.

Project instructions saying "use pip install FOO and you're good to go" can (and should) be changed.

I find it difficult to believe that you would put up with this argument if we were talking about a library API :-(. "Yes, many users of this API function call it with the default values, and yes, those work 95% of the time so that most users don't realize that their code is broken in the other 5% of cases. The solution is to keep that API the way it is and file bug reports forever telling everyone to add the unbreakme=True kwarg to every call. Because this is totally the user's fault."

we end up with this hovering over us forever, because there's always the possibility that someone could commit a PR, simply because those who objected the previous time didn't notice a discussion being reopened.

That's not how it does work, though -- notice that this change had extensive discussion on github and then there was a mailing list heads-up to make sure that no-one was surprised.

I actually kind of wish this is how it worked, because this change would be a fait accompli and we could all move on and stop wasting time on this ;-). And jokes aside, it might actually be healthier for the project if someone like dstufft decided to play BDFL in situations like this. Right now the de facto outcome is that changes are just impossible, and I'm starting to feel like it would be more productive to give up on trying to improve pip, and instead put my energy/recommend others put their energy into figuring out how to make a viable pip fork :-(

@pradyunsg
Member Author

Just to clarify - I don't believe that non-eager upgrades fix the issue that "pip install --upgrade foo" could upgrade foo from 1.0 to 2.0, but an already-installed bar might declare a dependency on foo (with no version) but not work with 2.0?

FWIW, I don't think it works even if bar explicitly depends on foo==1.0.

/tmp/pip-testing
$ ls ./repo
bar-1.0.tar.gz  foo-1.0.tar.gz  foo-2.0.tar.gz

/tmp/pip-testing
$ pip install --find-links ./

/tmp/pip-testing
$ pip install --find-links ./repo bar
Collecting bar
Collecting foo==1.0 (from bar)
Building wheels for collected packages: bar, foo
  Running setup.py bdist_wheel for bar ... done
  Stored in directory: /home/pradyunsg/.cache/pip/wheels/20/cd/44/f59790040978a7eb9989ce680e85681c252516bd7fc9baf059
  Running setup.py bdist_wheel for foo ... done
  Stored in directory: /home/pradyunsg/.cache/pip/wheels/27/9a/5f/3e8efff98718d38adb7cf6b20e4435694e8c465085792441be
Successfully built bar foo
Installing collected packages: foo, bar
Successfully installed bar-1.0 foo-1.0

/tmp/pip-testing
$ pip install --upgrade foo
Requirement already up-to-date: foo in /home/pradyunsg/.venvwrap/venvs/tmp-734de48113851ca/lib/python3.5/site-packages

/tmp/pip-testing
$ pip install --find-links ./repo --upgrade foo
Collecting foo
Building wheels for collected packages: foo
  Running setup.py bdist_wheel for foo ... done
  Stored in directory: /home/pradyunsg/.cache/pip/wheels/9d/3e/ce/b183a52b3e6844394d6cbf5606acadf8c340d48ccfcf02cc1c
Successfully built foo
Installing collected packages: foo
  Found existing installation: foo 1.0
    Uninstalling foo-1.0:
      Successfully uninstalled foo-1.0
Successfully installed foo-2.0

/tmp/pip-testing
$ pip list
bar (1.0)
foo (2.0)
pip (8.1.2)
setuptools (25.0.0)
wheel (0.29.0)

/tmp/pip-testing
$ pip --version
pip 8.1.2 from /home/pradyunsg/.venvwrap/venvs/tmp-734de48113851ca/lib/python3.5/site-packages (python 3.5)
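
The inconsistent end state of the session above can be detected mechanically. The following is only a toy sketch: the `installed` and `pins` dictionaries are typed in by hand from the output above (`pip list` and "Collecting foo==1.0 (from bar)"), `find_broken_pins` is a hypothetical helper, and only exact `==` pins are handled.

```python
from typing import Dict, List, Optional, Tuple

# Installed versions, copied from the `pip list` output above.
installed: Dict[str, str] = {"bar": "1.0", "foo": "2.0"}

# Exact pins declared by each package; bar's pin comes from
# "Collecting foo==1.0 (from bar)" in the session above.
pins: Dict[str, Dict[str, str]] = {"bar": {"foo": "1.0"}, "foo": {}}


def find_broken_pins(
    installed: Dict[str, str], pins: Dict[str, Dict[str, str]]
) -> List[Tuple[str, str, str, Optional[str]]]:
    """Return (package, dependency, wanted, found) for every unmet == pin."""
    problems = []
    for pkg, reqs in sorted(pins.items()):
        if pkg not in installed:
            continue  # pins of packages that are not installed do not matter
        for dep, wanted in sorted(reqs.items()):
            found = installed.get(dep)  # None if the dependency is missing
            if found != wanted:
                problems.append((pkg, dep, wanted, found))
    return problems


for pkg, dep, wanted, found in find_broken_pins(installed, pins):
    print(f"{pkg} requires {dep}=={wanted}, but {found} is installed")
```

A check of this kind only reports the breakage after the fact; it does not prevent the upgrade from happening.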

@njsmith
Member

njsmith commented Jul 24, 2016

I don't think it works even if bar explicitly depends on foo==1.0.

Right, this is the "pip needs a real resolver" bug, which will get fixed eventually but is a big task so we don't want it to block other things if at all avoidable.

OTOH the case where bar uses an unversioned dependency on foo is basically impossible to get right AFAICT, so I'm not sure what it has to do with anything. The only solution for that is "never touch your venv ever again", and even that isn't guaranteed (because of things like new security holes or changes in external APIs that you need to talk to).

@pfmoore
Member

pfmoore commented Jul 24, 2016

In fact, upgrading foo and installing foo are exactly as dangerous, because they do exactly the same thing -- they pull in the exact same versions of the exact same packages.

OK. We really are simply going to have to agree to disagree on this. In my view, it's about the user's perception - "installing foo" is adding something previously not present to your system, whereas "upgrading foo" is changing something that's already there. To the user, these are far from being the same thing.

Right now the de facto outcome is that changes are just impossible

OK, I give up. I don't believe there's consensus on this change, and I think it's wrong to implement it without consensus. You know I don't agree with it myself, but that's not the point here. Changes really aren't impossible (we've made plenty of changes, some pretty controversial) but neither side in this argument seems able to convince the other. In my view, that typically results in the status quo winning - but I'm aware that by saying that I'm going to be perceived as implying that "all I have to do to get my way is stall things". IMO, it says something about where we are at the moment that I feel that way :-(

I'm bowing out of this discussion now. If anyone makes an argument that changes my mind, I'll acknowledge that, but otherwise I have nothing more to say. If I'm the last holdout for not making this change, I give my permission to everyone to ignore me - I certainly don't feel that I (or anyone) should have a veto over changes, and I'm completely comfortable accepting a majority decision. If others do still have reservations about this change, they'll have to make their own arguments (but I'd remind participants that not everyone reads github issues - in spite of the discussion going off-track, there were some comments on distutils-sig that IMO deserve a response).

@dstufft
Member

dstufft commented Jul 24, 2016

And jokes aside, it might actually be healthier for the project if someone like dstufft decided to play BDFL in situations like this. Right now the de facto outcome is that changes are just impossible

While I don't think that our current process is optimal, I don't think it's quite as bad as "changes are impossible". Previously we would generally bring something up on the pypa-dev ML, with a simple majority vote amongst pip core in cases where there wasn't a clear consensus. I think there are three active pip core devs now (myself, @pfmoore, and @xavfernandez), so if all three of us vote you end up with a vote one way or another instead of a tie. Could we use a more formalized process? Yes, probably. Could that be a BDFL role? Possibly, but I don't think that's required either.

Sadly, the current ad hoc process typically means that one of the core contributors needs to sit down and decide to push for the change and say "Ok let's vote on this" and declare some ad hoc rules for doing so.

Recall, there was agreement on changing the behavior of --upgrade, which is the major thing that was preventing things like projects depending on Numpy from declaring their dependency. This particular change is just an idea that came out of that and is more of a UX thing than anything else. Like any project, unless you're the BDFL there are going to be times when the decision making process goes against the option you want. I haven't done any of this because I've been focusing on Warehouse lately, pip will be coming back in my cross hairs after that's launched :)

@xavfernandez
Member

No-one runs pip install foo when they already know that foo is installed, because that would be silly.

I'd guess you're right, but I'd also say a lot of people are running pip install -r requirements.txt on a daily basis, with requirements.txt containing foo (which is equivalent to pip install foo), even though they know that foo is installed.
And they are happy with the fact that pip does it quickly without checking if there is something to upgrade.

I'm not against the idea that pip install foo could be equivalent to pip install foo==$LATEST (in fact I like it) but I'm against changing this fundamental behavior without a deprecation period (and an escape option to keep the old behavior).
I'm not sure we have discussed this solution already, but this could be a new --strategy option to pip install:

  • no-upgrade would be the default in pip 9 and pip install --strategy=no-upgrade would be the current pip install behavior
  • eager would be the current pip install --upgrade behavior (and --upgrade a deprecated alias for --strategy=eager)
  • non-eager would be the default in pip 10
  • we could imagine an oldest-compatible strategy for "pip always uses the highest compatible version" (#3188), etc.

Note that you could also put strategy=non-eager in your pip.conf to directly have it being the default in pip 9.
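
As a rough sketch of how such an option could be resolved, assuming the strategy names from the list above: the parser below is a standalone `argparse` toy, not pip's actual option handling, and `default_strategy` and `resolve_strategy` are hypothetical helpers. The pip.conf override is not modelled, and letting an explicit --strategy win over --upgrade is an assumption of this sketch.

```python
import argparse
from typing import List

STRATEGIES = ("no-upgrade", "eager", "non-eager")


def default_strategy(pip_major: int) -> str:
    """Default per the proposal: no-upgrade in pip 9, non-eager from pip 10."""
    return "no-upgrade" if pip_major < 10 else "non-eager"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pip install")
    parser.add_argument("--strategy", choices=STRATEGIES, default=None)
    # --upgrade is kept as a deprecated alias for --strategy=eager.
    parser.add_argument("-U", "--upgrade", action="store_true")
    parser.add_argument("requirements", nargs="*")
    return parser


def resolve_strategy(argv: List[str], pip_major: int) -> str:
    args = build_parser().parse_args(argv)
    if args.strategy is not None:
        return args.strategy  # an explicit --strategy wins (assumption)
    if args.upgrade:
        return "eager"
    return default_strategy(pip_major)


print(resolve_strategy(["foo"], pip_major=9))
print(resolve_strategy(["foo"], pip_major=10))
print(resolve_strategy(["--upgrade", "foo"], pip_major=9))
```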

Could we use a more formalized process? Yes probably.

👍

@dstufft
Member

dstufft commented Jul 25, 2016

And they are happy with the fact that pip does it quickly without checking if there is something to upgrade.

It'd probably still be pretty quick TBH. We serve responses in less than a ms from the Fastly cache :)

I'm not against the idea that pip install foo could be equivalent to pip install foo==$LATEST (in fact I like it) but I'm against changing this fundamental behavior without a deprecation period (and an escape option to keep the old behavior).

I'm fine with a deprecation period. I'm not sure about a long term option to keep the old behavior. I'm not opposed, I just want to make sure that it's something we really should support long term. Options in general come with a cost, and I want to make sure the cost is worth it.

@pradyunsg
Member Author

How does this sound?

Followed up with #3972.

@pradyunsg
Member Author

Closing since #3972 is merged.

We have taken a different path to resolving the behaviour of --upgrade.

@pradyunsg pradyunsg added the C: upgrade The logic of upgrading packages label May 11, 2018
@lock

lock bot commented Jun 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 2, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jun 2, 2019