pip install of a directory is super slow #2195

Closed
msabramo opened this issue Dec 15, 2014 · 77 comments
Labels
state: needs discussion This needs some more discussion type: enhancement Improvements to functionality

Comments

@msabramo
Contributor

msabramo commented Dec 15, 2014

See #2195 (comment), for a summary of this issue.


I don't understand why pip needs 17 seconds to process a local directory that is not on NFS (in fact, it's on an SSD). The package being installed is pip itself, which has no dependencies, since everything is vendored.

$ time pip install --no-install ~/dev/git-repos/pip
DEPRECATION: --no-install and --no-download are deprecated. See https://github.com/pypa/pip/issues/906.
Processing /Users/marca/dev/git-repos/pip
  Requirement already satisfied (use --upgrade to upgrade): pip==6.0.dev1 from file:///Users/marca/dev/git-repos/pip in /Users/marca/dev/git-repos/pip
pip install --no-install ~/dev/git-repos/pip  2.80s user 5.86s system 50% cpu 17.205 total

It should probably at least be logging whatever is taking that long, but maybe it shouldn't even be doing whatever it's doing.

Note that the "Processing" line appears right away and pretty much the whole delay seems to be between that line and the next one.

@tomprince

It is making a copy of the entire directory, including .git. It probably shouldn't be doing that, no.

@msabramo
Contributor Author

$ du -sh pip
263M    pip
$ du -sk * .cache .git .tox .travis | sort -nr | head -n 5
181860  .tox
34836   tests
31700   .git
9212    pip
2852    build

@msabramo
Contributor Author

I tried passing 3 -v's (time pip install -vvv --no-install ~/dev/git-repos/pip) -- that didn't yield any more info.

@msabramo
Contributor Author

Stepping through it with pdb, things slow down when I get to:

> /Users/marca/dev/git-repos/pip/pip/req/req_set.py(365)prepare_files()
-> unpack_url(

@msabramo
Contributor Author

And yep, @tomprince is right - it slows down when it does a copy of the whole tree:

> /Users/marca/dev/git-repos/pip/pip/download.py(635)unpack_file_url()
-> shutil.copytree(link_path, location, symlinks=True)

@msabramo
Contributor Author

$ time pip install --no-install ~/dev/git-repos/pip
DEPRECATION: --no-install and --no-download are deprecated. See https://github.com/pypa/pip/issues/906.
Processing /Users/marca/dev/git-repos/pip
  2014-12-15 15:23:34.630794: Copying tree; link_path = '/Users/marca/dev/git-repos/pip'; location = '/var/folders/gw/w0clrs515zx9x_55zgtpv4mm0000gp/T/pip-D6etc4-build'
  2014-12-15 15:23:57.418679: DONE copying tree; link_path = '/Users/marca/dev/git-repos/pip'; location = '/var/folders/gw/w0clrs515zx9x_55zgtpv4mm0000gp/T/pip-D6etc4-build'
  Requirement already satisfied (use --upgrade to upgrade): pip==6.0.dev1 from file:///Users/marca/dev/git-repos/pip in /Users/marca/dev/git-repos/pip
pip install --no-install ~/dev/git-repos/pip  2.75s user 5.03s system 32% cpu 24.168 total
>>> elapsed time 24s
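
The timestamped lines in the output above came from ad hoc logging around the copy. A minimal sketch of that kind of instrumentation follows; `log_step` and `timed_copytree` are hypothetical names used for illustration, not functions in pip.

```python
import shutil
import time
from contextlib import contextmanager
from datetime import datetime


@contextmanager
def log_step(description, log=print):
    """Log a timestamp before and after the wrapped block (hypothetical helper)."""
    start = time.monotonic()
    log(f"{datetime.now()}: {description}")
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        log(f"{datetime.now()}: DONE {description} ({elapsed:.1f}s)")


def timed_copytree(link_path, location, log=print):
    """Copy the whole tree, as unpack_file_url did, and report how long it took."""
    description = f"copying tree; link_path = {link_path!r}; location = {location!r}"
    with log_step(description, log):
        # Same call as in the pdb trace above: everything is copied,
        # including .git and .tox.
        shutil.copytree(link_path, location, symlinks=True)
```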

@msabramo
Contributor Author

Some discussion in #2196

msabramo added a commit to msabramo/pip that referenced this issue Dec 16, 2014
by ignoring .tox, .git, .hg, .bzr, and .svn  when doing
`shutil.copytree` in unpack_file_url in pip/download.py.

Fixes: pypaGH-2195
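
The approach described in this commit message can be sketched with the standard library's `shutil.ignore_patterns`. This is an illustration under assumptions, not the actual patch: `copy_source_tree` and `IGNORED_DIRS` are hypothetical names, and only the list of ignored directories comes from the commit message.

```python
import shutil

# Directory names taken from the commit message above.
IGNORED_DIRS = (".tox", ".git", ".hg", ".bzr", ".svn")


def copy_source_tree(link_path, location):
    """Copy a source tree while skipping VCS metadata and tox environments."""
    shutil.copytree(
        link_path,
        location,
        symlinks=True,
        # ignore_patterns matches entry names at every depth, so a nested
        # directory called ".git" is skipped too.
        ignore=shutil.ignore_patterns(*IGNORED_DIRS),
    )
```

Note that this is a heuristic: anything large that is not on the list is still copied.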
@msabramo
Contributor Author

It's much faster now that #2196 is merged.

@msabramo
Contributor Author

This should be reopened since #2196 was reverted. I'd like to come up with an alternative PR that builds an sdist instead of using heuristics to figure out what to copy. See the comments on that PR for details.

@msabramo msabramo reopened this Mar 13, 2015
@msabramo
Contributor Author

$ time pip install --no-install ~/dev/git-repos/pip
DEPRECATION: --no-install and --no-download are deprecated. See https://github.com/pypa/pip/issues/906.
Processing /Users/marca/dev/git-repos/pip
  Requirement already satisfied (use --upgrade to upgrade): pip==6.1.0.dev0 from file:///Users/marca/dev/git-repos/pip in /Users/marca/dev/git-repos/pip
pip install --no-install ~/dev/git-repos/pip  3.67s user 8.12s system 7% cpu 2:45.83 total
>>> elapsed time 2m46s

Yikes, almost 3 minutes.

Probably mostly due to this:

$ du -sh .tox
177M    .tox

The .tox directory is 177M out of a total 270M for my whole pip directory.
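
For readers without `du`, a rough equivalent of the size breakdown can be sketched in Python. `tree_size` and `largest_entries` are hypothetical helper names, and the numbers are apparent file sizes in bytes rather than disk blocks, so they will not match `du` exactly.

```python
import os


def tree_size(path):
    """Sum the sizes of all regular files below path (symlinks are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def largest_entries(top, limit=5):
    """Return the `limit` largest top-level entries as (size_in_bytes, name)."""
    sizes = []
    for entry in os.scandir(top):
        if entry.is_dir(follow_symlinks=False):
            size = tree_size(entry.path)
        else:
            size = entry.stat(follow_symlinks=False).st_size
        sizes.append((size, entry.name))
    return sorted(sizes, reverse=True)[:limit]
```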

msabramo added a commit to msabramo/pip that referenced this issue Mar 13, 2015
Right now it's just a pretty simple `shutil.copytree`, but ideally we want
it to do something more complex, involving building an sdist.

Plus, this makes `unpack_file_url` fit on a single screen without
scrolling for me. :-)

See pypa#2195
msabramo added a commit to msabramo/pip that referenced this issue Mar 13, 2015
E.g.: `pip install /path/to/dir`

by building an sdist and then unpacking it instead of doing
`shutil.copytree`. `shutil.copytree` may copy files that aren't included
in the sdist, which means that:

1. If those files are large, a `pip install` could take much longer than
it should.

2. Folks that are using `python setup.py install` to test their package
are not fully testing it, because it may copy over files that wouldn't
be there if they built an sdist and installed that.

So this method of building an sdist is both more accurate and faster.

Fixes pypa#2195
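
The sdist-then-unpack idea from this commit message could look roughly like the sketch below. It is an illustration under assumptions, not the code from the PR: the function names are hypothetical, it assumes a `setup.py`-based project whose `sdist` command accepts `--formats` and `--dist-dir`, and it only extracts regular files and directories.

```python
import os
import subprocess
import sys
import tarfile
import tempfile


def build_sdist(source_dir, dist_dir):
    """Run `setup.py sdist` in source_dir and return the archive path."""
    subprocess.check_call(
        [sys.executable, "setup.py", "sdist",
         "--formats=gztar", "--dist-dir", os.path.abspath(dist_dir)],
        cwd=source_dir,
    )
    (archive,) = os.listdir(dist_dir)  # exactly one archive is expected
    return os.path.join(dist_dir, archive)


def unpack_sdist(archive, location):
    """Extract archive into location, dropping the top-level `name-version/` dir."""
    os.makedirs(location, exist_ok=True)
    with tarfile.open(archive) as tar:
        for member in tar.getmembers():
            if not (member.isfile() or member.isdir()):
                continue  # skip links and special files in this sketch
            _top, _sep, relative = member.name.partition("/")
            if not relative:
                continue  # the top-level directory entry itself
            member.name = relative
            tar.extract(member, location)


def copy_via_sdist(source_dir, location):
    """Populate location with only the files the sdist would contain."""
    with tempfile.TemporaryDirectory() as dist_dir:
        unpack_sdist(build_sdist(source_dir, dist_dir), location)
```

Only files that the project declares as part of its sdist reach `location`, which is where both the speedup and the more faithful test of packaging come from.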
@msabramo
Contributor Author

Please see #2535, which speeds up unpack_file_url by building an sdist and unpacking it.

@rgommers

rgommers commented Nov 1, 2015

This issue should be reopened, because the merged PR did nothing (see gh-3219).

@xavfernandez xavfernandez reopened this Nov 2, 2015
rgommers added a commit to rgommers/pip that referenced this issue Nov 3, 2015
This is a follow-up to pypagh-2535, which added the code to copy via
(sdist + unpack) instead of shutil.copytree, but forgot to actually
call that function.

Fixes pypagh-2195 (slow pip install of a dir).
warner added a commit to warner/pip that referenced this issue Apr 13, 2016
This works around a problem with the new sdist-based "pip install .":

* when creating the sdist, we don't run a literal "setup.py sdist"
* instead, sys.argv[0] is a complicated shim that injects
  setuptools even into distutils-based projects
* as a result, distutils.command.sdist.add_defaults() doesn't realize
  that "setup.py" is the name of its setup script (it gets confused
  because sys.argv[0] is not a real file).
* so add_defaults() doesn't include setup.py in the generated
  tarball. (projects could add "include setup.py" to their MANIFEST.in,
  but this is not common practice because usually it's automatic)
* so the unpacked sdist (from which pip will make a wheel) lacks the
  critical setup.py

This copies the setup.py from source tree to unpacked target tree.

The patch also removes a performance comment that was obsoleted by
switching to _copy_dist_from_dir().

refs pypa#2195, pypa#2535, pypa#3176
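
The workaround in this commit message amounts to one extra copy step after unpacking. A minimal sketch, assuming the copy is only needed when the sdist left `setup.py` out; `ensure_setup_py` is a hypothetical name and the actual patch may copy unconditionally.

```python
import os
import shutil


def ensure_setup_py(source_dir, location):
    """Make sure the unpacked tree at location has a setup.py."""
    target = os.path.join(location, "setup.py")
    if not os.path.exists(target):
        # The generated sdist omitted setup.py, so take it from the source tree.
        shutil.copy2(os.path.join(source_dir, "setup.py"), target)
    return target
```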
@sbidoul
Member

sbidoul commented Apr 13, 2020

This should be resolved by #7882 (build local directories in place).

@sbidoul sbidoul closed this as completed Apr 13, 2020
@pradyunsg
Member

We have now (per #7951) published a beta release of pip, pip 20.1b1. This release includes #7882, which implemented a solution for this issue.

I hope participants in this issue will help us by testing the beta and checking for new bugs. We'd like to identify and iron out any potential issues before the main 20.1 release on Tuesday.

I also welcome positive feedback along the lines of "yay, it works better now!" as well, since the issue tracker is usually full of "issues". :)

@PythonCoderAS

I will say that it is considerably better.

Old: noglob pip3 install . 3.76s user 2.51s system 12% cpu 50.245 total

New: noglob pip3 install . 3.40s user 0.70s system 42% cpu 9.764 total

@astrofrog

astrofrog commented Apr 25, 2020

Works great/faster for me! 👍

@klamann

klamann commented Apr 26, 2020

» pip --version
pip 20.0.2 
» time pip install .
noglob pip install .  8.03s user 18.47s system 25% cpu 1:44.84 total
» pip --version
pip 20.1b1 
» time pip install .
noglob pip install .  3.69s user 0.31s system 92% cpu 4.307 total

down from ~2 minutes to 4 seconds, thank you so much!

@pradyunsg
Member

Thank you for the positive reports @PythonCoderAS @astrofrog @klamann! :)

Unfortunately, there have been a number of issues with the implementation of in-place builds (which are being tracked under #7555) which means that for now, we need to revert #7882. As a result, this issue will become a problem again, and we'll therefore be reopening it. Longer-term, we hope to have a solution that addresses the issues that in-place builds solved, but without the impact on other workflows that the current solution had.

Sorry for the disruption that this will cause.

@pradyunsg pradyunsg reopened this May 14, 2020
@rgommers

Unfortunately, there have been a number of issues with the implementation of in-place builds

@pradyunsg thanks for the update. Some feedback on terminology (please feel free to ignore, just FYI): this sentence, as well as gh-7555, confused me because pip does not do in-place builds. What in-place builds has always meant is python setup.py build_ext --inplace (or python setup.py develop).

Here you changed the meaning to: "build without copying to a tmpdir". Extension modules still don't end up in-place, they end up in a build/ dir that's usually easily cleaned up. It would be nice to be a little more explicit in for example gh-7555.

@pfmoore
Member

pfmoore commented May 14, 2020

That was originally my wording. Sorry for any confusion, I wasn't aware that setuptools used the term "in place" to mean something different (and I'm still not really sure how that terminology applies outside of setuptools). We'll see if we can find a more neutral term in future (although offhand, I'm not sure what - suggestions gratefully accepted 😉)

@rgommers

No worries at all, thanks @pfmoore. I just thought I'd point it out, since confusion about terminology can sometimes result in talking past each other.

and I'm still not really sure how that terminology applies outside of setuptools

For tools like CMake and scikit-build I think it means the same thing: actually in-place, binaries land next to sources.

"editable installs" on the other hand is (I believe) invented here, and kinda means "in-place that pip is aware of".

although offhand, I'm not sure what - suggestions gratefully accepted

maybe just "local build" (vs. the current "copy to tmpdir and build")?

@gaborbernat

"editable installs" on the other hand is (I believe) invented here, and kinda means "in-place that pip is aware of".

We recently had a long discussion on what editable install means, and I think we actually landed in a place that is more along the lines of "machine local" as far as pip goes. But pip is unaware of where and how on the local machine; it is the build backend's job to define and handle that.

@merwok

merwok commented May 14, 2020

Could try «in-tree build» (similar to «in-tree PEP 517 backend») or «build in source dir»

@PythonCoderAS

My question is, why can't the feature be optional, so it does not cause problems but can be enabled by an argument or something similar?

@salotz

salotz commented Jul 27, 2020

I'm trying to wrap my head around the workarounds for this, where an editable install isn't an option. Are there any?

@merwok

merwok commented Jul 28, 2020

A workaround could be to build a wheel (using your build backend directly) then point pip to install it

@pradyunsg
Member

why can't the feature be optional, so it does not cause problems but can be enabled by an argument or something similar?

It can. The reason for reverting the change was that we didn't have any opt-outs or a period for getting feedback on the change. We do have new flags to help facilitate that (--use-feature and --deprecated-feature), but someone has to reimplement/reintroduce the functionality in this context now.

Broadly, I think what we want to do here is:

  • Add a --use-feature=in-tree-build as an opt in.
  • Switch the default in a later release w/ a --deprecated-feature=out-of-tree-build as an opt out + pushing users of --use-feature=in-tree-build to drop it.
  • Drop both of the options in a subsequent release.
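
The three-phase rollout above can be sketched as a small decision function. This is purely illustrative and not pip's implementation: `build_in_tree` and its parameters are hypothetical, and only the two feature names come from the list above.

```python
def build_in_tree(use_features, deprecated_features, default_in_tree):
    """Decide whether to build in the source tree (hypothetical helper).

    use_features / deprecated_features are the sets of names the user passed;
    default_in_tree is False in the opt-in phase and True after the switch.
    """
    opted_in = "in-tree-build" in use_features
    opted_out = "out-of-tree-build" in deprecated_features
    if opted_in and opted_out:
        raise ValueError("in-tree-build and out-of-tree-build are mutually exclusive")
    if opted_in:
        return True
    if opted_out:
        return False
    return default_in_tree
```

In the first phase `default_in_tree` would be `False`, so only users who opt in get the new behavior. After the switch it becomes `True` and the opt-out keeps the old copy-to-tmpdir path available for one more release.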

@salotz

salotz commented Jul 28, 2020

A workaround could be to build a wheel (using your build backend directly) then point pip to install it

I was thinking without an extra build step. But I guess I should have never thought Python could get away without Makefile equivalents from the beginning.

@dkbarn

dkbarn commented Feb 22, 2021

FYI for those who are running into this issue -- A workaround is to replace pip install . with:

python setup.py bdist_wheel
pip install dist/*.whl
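
One caveat with the glob is that `dist/*.whl` also matches wheels left over from earlier builds. A scripted variant that installs only the most recently built wheel might look like this; `newest_wheel` and `build_and_install` are hypothetical helpers, and the sketch assumes a `setup.py`-based project with the `wheel` package available.

```python
import glob
import os
import subprocess
import sys


def newest_wheel(dist_dir="dist"):
    """Return the most recently modified .whl file in dist_dir."""
    wheels = glob.glob(os.path.join(dist_dir, "*.whl"))
    if not wheels:
        raise FileNotFoundError(f"no wheels found in {dist_dir!r}")
    return max(wheels, key=os.path.getmtime)


def build_and_install(project_dir="."):
    """Build a wheel in the source tree, then install it with pip."""
    subprocess.check_call([sys.executable, "setup.py", "bdist_wheel"], cwd=project_dir)
    wheel = newest_wheel(os.path.join(project_dir, "dist"))
    subprocess.check_call([sys.executable, "-m", "pip", "install", wheel])
```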

dedyk pushed a commit to dedyk/jmdictdb_mirror that referenced this issue Feb 26, 2021
setup.py was changed in 200625-8ea3fd9 to explicitly specify the
packages to be installed rather than using find_packages() in an
attempt to diagnose what turned out to be a setuptools problem with
excessively slow installs (pypa/pip#2195).
Didn't realize subpackages also have to be named explicitly.
@uranusjr
Member

uranusjr commented Mar 3, 2022

Now that in-tree-build is available, should we close this?

@pradyunsg
Member

Yes.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 4, 2022