Give a clear recommendation on what sdists should contain #1494

Open
jeanas opened this issue Jan 27, 2024 · 17 comments
Labels
component: guides component: specifications state: blocked type: task Something that needs to be done that is not a bug or feature

Comments

@jeanas
Contributor

jeanas commented Jan 27, 2024

There is currently confusion in the ecosystem around what sdists should contain, and specifically

  • whether they should contain tests and documentation
  • whether they should contain auxiliary files (e.g., maintenance scripts)

This is an issue for downstream packagers, who normally want at least the tests. If sdists should contain tests, downstream packagers should be able to use sdists from PyPI and sdists which currently do not contain tests should be changed by upstream authors. If sdists should not contain tests, packagers should migrate to using upstream sources like VCS checkouts.

There has been no real consensus so far, AFAICS. The discussions were

https://discuss.python.org/t/should-sdists-include-docs-and-tests/14578

and

https://discuss.python.org/t/sdists-for-pure-python-projects/25191/40

Once a consensus is reached, it should be reflected in the guide.

@sinoroc
Contributor

sinoroc commented Jan 27, 2024

Maybe it is not something where a consensus can be reached. Maybe this document (guide or discussion?) could expose the different established practices with some pros and cons, and also with the (current) preferences of the various downstream packagers.

@webknjaz
Member

Exactly. I'm in the camp of having downstream-friendly sdists and even have a GHA to facilitate that.

@jeanas
Contributor Author

jeanas commented Jan 28, 2024

Maybe it is not something where a consensus can be reached.

Maybe, but I am personally more optimistic.

@chrysle
Contributor

chrysle commented Jan 28, 2024

If sdists should contain tests, downstream packagers should be able to use sdists from PyPI and sdists which currently do not contain tests should be changed by upstream authors.

I know that Debian packaging, at least, requires complete copies of the underlying project's source tree. That might not be a rare exception.

@chrysle chrysle added state: blocked type: task Something that needs to be done that is not a bug or feature component: specifications component: guides labels Jan 28, 2024
@webknjaz
Member

Gentoo and Fedora have helpers/macros for getting sdists from PyPI. I think Arch does too. Some others borrow helpers from other distros.
Though I've seen exceptions where packagers reverted to using Git in some cases.

@webknjaz
Member

It was brought up in another discussion that it'd be nice to have backends include files in an independent manner, which is a good idea, in my opinion.

One of the plugins, setuptools-scm, does just that, for example.

@merwok
Contributor

merwok commented Jan 28, 2024

What does it mean to include files in an independent manner?

@webknjaz
Member

@merwok could you clarify what the question is?

@jeanas
Contributor Author

jeanas commented Jan 28, 2024

I have the same question: what do you mean in your comment above by including files “in an independent manner”? Independent of what?

@chrysle
Contributor

chrysle commented Jan 29, 2024

I have the same question: what do you mean in your comment above by including files “in an independent manner”?

I guess what is meant is ensuring, per backend, that all relevant files are included in the sdist, independently of the build tool; in this case, through setuptools-scm's file_finders hook.
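
For readers unfamiliar with that hook: setuptools documents an entry point group named `setuptools.file_finders`, whose functions take a directory name and yield the files to include. Below is a minimal sketch of such a finder that lists Git-tracked files. The function names are hypothetical, and this is not a description of how setuptools-scm actually implements its finder.

```python
import subprocess


def parse_ls_files(output):
    """Split NUL-separated `git ls-files -z` output into a list of paths."""
    return [path for path in output.decode("utf-8").split("\0") if path]


def find_tracked_files(dirname=""):
    """Yield the files Git tracks under dirname (hypothetical finder).

    Assumes it runs inside a Git checkout with `git` available on PATH.
    """
    result = subprocess.run(
        ["git", "ls-files", "-z", "--", dirname or "."],
        capture_output=True,
        check=True,
    )
    yield from parse_ls_files(result.stdout)
```

A plugin would expose `find_tracked_files` under the `setuptools.file_finders` entry point group so that setuptools calls it when building the sdist.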

@webknjaz
Member

Ah, I missed that they referred to my statement. I meant that all the backends could reuse the same logic, since there's really nothing novel sdist generators can do, short of having different configuration interfaces.
The PEP 517 hooks allow generating something before putting it into sdists, of course, but maybe that should be discouraged in favor of just including the necessary files (with an exception for projects with Git submodules, I suppose).

@webknjaz
Member

One thing this discussion document should emphasize is that an sdist must contain everything necessary for making a wheel out of it. And in the case of the downstreams, also docs, tests, and unified workflow automation configs like tox.ini. It could also have a section outlining what things are commonly omitted, like the upstream CI setup, or things like GH issue forms / PR templates. Though the GHA setup could be useful for helping the downstreams extract non-obvious bits of metadata, like the tested platforms and how the commands are run.
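
As a rough illustration of what a downstream-friendliness check could look like, here is a minimal sketch that opens an sdist and reports whether tests, docs, and a tox.ini are present. The category names and path prefixes are assumptions made for this example, not an agreed-upon standard.

```python
import tarfile

# Hypothetical categories and path prefixes, chosen only for illustration.
EXPECTED = {
    "tests": ("tests/", "test/"),
    "docs": ("docs/", "doc/"),
    "tox config": ("tox.ini",),
}


def audit_sdist(fileobj):
    """Return {category: bool} for a .tar.gz sdist given as a binary file object."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        # Drop the leading "<name>-<version>/" directory that sdists use.
        members = [name.split("/", 1)[1] for name in tar.getnames() if "/" in name]
    return {
        category: any(member.startswith(prefixes) for member in members)
        for category, prefixes in EXPECTED.items()
    }
```

It could be called as `audit_sdist(open("dist/pkg-1.0.tar.gz", "rb"))`, where the path is a placeholder.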

@chrysle
Contributor

chrysle commented Jan 29, 2024

and unified workflow automation configs like tox.ini.

At Debian, tox/nose (yeah, the build system is quite ancient) are the preferred test automation tools. Of course it's also possible to customize testing, but that comes with increased effort.

One thing this discussion document should emphasize is that an sdist must contain everything necessary for making a wheel out of it.

Sad, actually Linux distributions discourage providing wheels with packages. Debian only has a few universal wheels to make pip, virtualenv and pyenv work.

@merwok
Contributor

merwok commented Jan 29, 2024

Sad, actually Linux distributions discourage providing wheels with packages.

I think you’re misinterpreting the comment you quoted: it wasn’t saying that distributors should use wheels as distributions, but referring to the fact that modern python install tools always build a wheel from an sdist in their process (before moving files to install locations).

@chrysle
Contributor

chrysle commented Jan 29, 2024

referring to the fact that modern python install tools always build a wheel from an sdist in their process (before moving files to install locations).

I know that; I was only bemoaning the fact that packagers don't make actual use of these wheels, which always get built anyway.

@webknjaz
Member

There are legitimate reasons behind that. Distro repos are curated and tested as a bundle: all the downstream packages are tested against each other, not in isolation. Platform-specific wheels, especially, contain bits of other software that are usually available in those ecosystems through a native package. The system-wide software usually expects that apps are linked against the same system libs.
Moreover, they prefer dynamic linking over static so that whenever they update the underlying libraries and bring in security updates, they don't actually need to rebuild all the packages in their repo and force the end-users to update their packages with that.
Platform-specific wheels vendor the libs, which means that if a security vulnerability is found in a bundled shared object, they have to rebuild and re-release so that the pip users would be able to get the fixes.
Another reason for dynamic linking is that each app doesn't carry its own copy of statically built libs, which saves disk space; many wheels would otherwise have duplicate libs just because of how the manylinux-adjacent standards work.

With platform-specific wheels the burden of shipping security patches that come from dependencies falls on the shoulders of the upstream maintainers. But with downstreams, the platform does it for them.
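
To make the vendoring point concrete, here is a minimal sketch that lists the members of a wheel that look like Linux shared objects. The function name is hypothetical, and the filename pattern is a simplification: it matches extension modules and vendored libraries alike and does not try to tell them apart.

```python
import re
import zipfile

# Matches names ending in ".so" or a versioned suffix such as ".so.1.2".
_SHARED_OBJECT = re.compile(r"\.so(\.\d+)*$")


def bundled_shared_objects(wheel):
    """List archive members of a wheel that look like Linux shared objects.

    `wheel` may be a path or a binary file object, as zipfile.ZipFile accepts.
    """
    with zipfile.ZipFile(wheel) as zf:
        return sorted(name for name in zf.namelist() if _SHARED_OBJECT.search(name))
```

Every vendored library in that list is one the upstream maintainers would have to rebuild and re-release when a fix for it lands.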

Plus, most downstreams promise to package only reviewed/compliant software in their repos, which effectively means a transparent process of building from source.

The point is that the downstream ecosystems are their own use-case with their own legitimate prerequisites. They have a separate user base as well (mostly regular users and sysadmins as opposed to the devs).

@webknjaz
Member

Oh, and they do use the wheel build process in their build systems. But they set it up slightly differently. What gets into their respective site-packages/ is effectively the content of those specially built wheels.

5 participants