Bring over PEPs 517, 518, and 660 to the specs section #955

Open
brettcannon opened this issue Jul 21, 2021 · 20 comments · May be fixed by #1111
Labels
component: specifications type: task Something that needs to be done that is not a bug or feature

Comments

@brettcannon
Member

As of right now, various specifications reference those PEPs, but there is no complete view of what they contain (e.g. the sdist spec references PEP 518 for what pyproject.toml is, although technically it should also reference PEP 517 for being able to actually build an sdist from a source tree). We should probably have a single specification that covers how to build sdists and wheels from a source tree which the sdist and wheel format specs can reference on how to end up with those binary artifacts.

@CAM-Gerlach
Contributor

We should probably have a single specification that covers how to build sdists and wheels from a source tree which the sdist and wheel format specs can reference on how to end up with those binary artifacts.

Would this involve unifying PEP 517 and PEP 660 in the resulting PyPA spec, with PEP 518 specifying the pyproject.toml format, or are you suggesting globbing that in there too? I'd think the former might be more sensible to keep the specs focused and at a manageable size, though I'd obviously defer to your judgement here.

@brettcannon
Member Author

Would this involve unifying PEP 517 and PEP 660 in the resulting PyPA spec, with PEP 518 specifying the pyproject.toml format, or are you suggesting globbing that in there too?

Probably all of that into a single spec. The build-system table in pyproject.toml is the unifying bit here. Basically we should have a spec for specifying the build tool for a distribution. That covers PEP 518, 517, and 660 as none of those can be specified without the previous PEP.

But I'm not doing the work and I can also see an argument to potentially do it separately.

@JDLH

JDLH commented Jun 20, 2022

…We should probably have a single specification that covers how to… [details elided]…

I agree. Please see related discussion at Discuss: What new Reference or Explanation would usefully reduce confusion?.

My thought is that there should be two specifications, one for the behaviour of tools and one for the format and location of the pyproject.toml file:

The main need I see is for a Reference (in the Diàtaxis sense) about the interaction of tools for building source distributions and wheels with declarative project metadata. Right now, PEP 517’s Build backend interface seems to be the authority on this. PEP 517’s Terminology and goals contributes. Part of what is not clear in PEP 517 is that the scope of “build” is narrow, limited to sdists and wheels. This confused me and some others.

Also, I don’t see a PyPA Specification which really describes the pyproject.toml file. There is a PyPA Declaring build system dependencies specification, but it is a stub which refers out to PEP 518. PEP 518 defines the pyproject.toml file overall, and the [build-system] and [tool] tables therein. PEP 517 defines the build-backend key in the [build-system] table. The PyPA Declaring project metadata specification describes the metadata parts of pyproject.toml well, but not the basic format or the [build-system] and [tool] tables.

I suggest a need for a Reference specifying everything about pyproject.toml files in one place. Maybe the shortest path is to add to the existing Declaring project metadata Specification everything from PEP 518 and PEP 517 about pyproject.toml that is not already there. It could continue with the existing format. The metadata heading would become a section of the overall Reference. It would need a more general title. Alternatively, there could be a new Specification about pyproject.toml overall, and it could delegate the metadata fields specification to the existing document. In either case, the Declaring build system dependencies stub gets replaced by another document.

In the thread at Discuss, I ask if we have a consensus on the desirability of creating this/these specification(s). Do we?

@CAM-Gerlach
Contributor

In the thread at Discuss, I ask if we have a consensus on the desirability of creating this/these specification(s). Do we?

Migrating the specs in these PEPs to the PyPA specifications site is uncontroversial, formally approved and well understood to be beneficial and necessary in the long term. The major reason it hasn't happened so far is simply due to no one with the relevant skills/background and motivation to do it having yet been able to find the time.

It might be possible to guide someone else without a strong background in the packaging ecosystem in what to do, since it's mostly an editorial and mechanical task, and we could help specify the high-level organization. However, the person would need to have a strong grasp of reST/Sphinx syntax, roles, directives and other constructs, and be a solid technical writer, and I'd be concerned that the time it would take to guide the contributor and address issues during review would be more time-consuming for both parties involved than me just dedicating a day to it and knocking it out.

On the other hand, it would be uniquely valuable to have a user like yourself, relatively new to the details of packaging but interested and motivated to understand it better, help revise the existing user-focused PyPUG "Tutorial" and "Explanation" type content, and in particular @cameron-simpson 's proposed high-level overview of the packaging ecosystem.

@CAM-Gerlach
Contributor

It seems there are four different approaches suggested thus far:

Summary of proposed approaches:
  • @brettcannon 's proposal, combining PEP 517, 518 and 660 into one document, keeps everything build-system related in one place. However, taken as-is, this also would mean the high-level spec of the pyproject.toml file and its top-level tables would be embedded within the build system spec, when only one of those tables is build-system related, and covered by other specs.
  • My original proposal, moving PEP 518 into one spec and PEP 517/660 into another, nominally separates the file format from the backend interface. However, taken as-is, this bifurcates the description of the build-system table between two specs, creating a circular dependency and making things hard to follow.
  • The first proposal above, putting everything pyproject.toml-related in one spec, keeps everything related to that file in one place. However, it requires moving existing content, mixes a bunch of different concerns and could become unmanageably long to navigate.
  • The second proposal, an overall document for pyproject.toml that delegates to the existing specification for the project table, avoids most of these issues, but it's not clear what it proposes to do about the build backend interface specs, which are the main thrust of PEP 517/660; including them in the same document basically reduces to @brettcannon 's proposal.

After thinking about these ideas some more, re-reading the PEPs and mocking up some outlines, I propose migrating the aforementioned PEPs via the following structure. This includes a dedicated high-level document for pyproject.toml that delegates and links the standardized top-level tables to individual specifications. That makes it an easily discoverable, navigable and maintainable one-stop shop without becoming either unmanageably long or buried in another not-directly-related spec; avoids requiring any modification to existing migrated specs; and maintains consistency, parallelism and extensibility, with each top-level table corresponding to one spec document.

Further, the below proposal separates the specification of the frontend-backend Python hooks and interface from the declaration of the build-system table options, since these are related but separate concerns with distinct audiences (author-specified static configuration vs. internal frontend-backend runtime interaction). This makes the resulting documents shorter and more focused (to wit, I tried and failed to come up with a concise but descriptive title that covered both), and roughly parallels the separation of the project table format from the underlying Core Metadata fields and semantics. However, if required, those can be combined into the same document, with the build-system table declaration being a top-level section ("The build-system table") of the backend interface specification.
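
As a rough illustration of those two concerns, the sketch below separates the static side (the build-backend string an author writes) from the runtime side (a frontend calling a hook on the resolved object). The names `resolve_backend`, `frontend_build_wheel` and `toy_backend` are hypothetical, and a real frontend would additionally install the declared requirements and run the hooks in an isolated environment:

```python
import importlib
from types import SimpleNamespace


def resolve_backend(spec: str):
    """Resolve a build-backend string ("module.path" or "module.path:object.path")."""
    module_path, _, object_path = spec.partition(":")
    obj = importlib.import_module(module_path)
    # Walk the optional object path after the colon, one attribute at a time.
    for attr in filter(None, object_path.split(".")):
        obj = getattr(obj, attr)
    return obj


def toy_build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    # Stand-in for a real PEP 517 hook: a real backend would write a wheel
    # into wheel_directory and return that file's basename.
    return "demo-1.0-py3-none-any.whl"


# Hypothetical in-memory backend exposing only the wheel hook.
toy_backend = SimpleNamespace(build_wheel=toy_build_wheel)


def frontend_build_wheel(backend, out_dir):
    # Runtime side: the frontend calls the hook and gets back the wheel's basename.
    return backend.build_wheel(out_dir)


print(frontend_build_wheel(toy_backend, "dist"))
# PEP 660's build_editable hook is optional, so a frontend has to check for it.
print(hasattr(toy_backend, "build_editable"))
```

Package authors only ever touch the string; the hook signatures matter to frontend and backend implementors.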

Proposed structure with mapping from existing PEP sections (click to expand)

@JDLH

JDLH commented Jun 27, 2022

It would be good for a pyproject.toml specification to answer the question, is a project permitted to add extra stanzas and keys not defined by the spec? Or is everything forbidden unless specifically permitted?

Source: this Discuss question: Adding extra fields in the pyproject.toml authors/maintainers list.

…it occurred to me that adding other identifiers (like ORCID ) is something that makes sense for our use case, where all contributors to the codebase are added as authors to scientific papers describing the software.…

@CAM-Gerlach
Contributor

CAM-Gerlach commented Jun 27, 2022

As mentioned on #1093 , I hope to have a draft of the above proposed structure up as a PR here by sometime this week, presuming no major objections (though I've given due consideration to such alternatives, it is relatively flexible to adapt if reviewers feel that, e.g. "Declaring a project's build system" should be a subsection of "Build backend interface specification", or that the hooks in the latter should be organized by required/optional rather than build type).

It would be good for a pyproject.toml specification to answer the question, is a project permitted to add extra stanzas and keys not defined by the spec? Or is everything forbidden unless specifically permitted?

The process of migrating the specifications described in this issue is strictly an editorial one; to that end, I've been careful to map existing atomic normative sections in the approved PEPs as directly as practicable to those of the proposed unified specifications, aside from any changes from subsequent approved PEPs (e.g. explicitly adding the project table from PEP 621/Declaring Project [Source] Metadata to the top-level pyproject.toml spec).

Any non-trivial changes to the normative content need to follow the PyPA specification update process, which, depending on the scope of the change, ranges from a normal pull request on this repo (once the spec is migrated) for trivial corrections, typos and similar, to a Discourse discussion for small but potentially content-relevant changes, to a new PEP for significant changes.

In any case, the question being asked there concerns keys in the project table, which are defined by the Declaring project [source] metadata specification already hosted here, which explicitly states that "No tools may add fields to this table which are not defined by this specification" and that the tool table must be used instead, as it is explicitly reserved in PEP 518 for such. I've replied on that thread with more details and suggested approaches directly addressing the OP's question.

Also, note that PEP includes "additionalProperties": false in its JSON schema, which specifies that no new non-standardized top-level tables should be added to pyproject.toml either (since again, tool exists for exactly that purpose).
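
For illustration, a minimal sketch of what that restriction implies for a checker. `STANDARD_TABLES` lists only the three tables named in this thread and is an assumption that later specifications may extend; `unexpected_tables` and the sample data are hypothetical:

```python
# Top-level tables named in this discussion (assumption: not a complete or
# permanent list; later specifications may add more).
STANDARD_TABLES = {"build-system", "project", "tool"}


def unexpected_tables(pyproject: dict) -> list[str]:
    """Return the top-level tables that are not in the standardized set."""
    return sorted(set(pyproject) - STANDARD_TABLES)


# Hypothetical parsed pyproject.toml, in the form a TOML parser would return.
parsed = {
    "build-system": {"requires": ["setuptools"]},
    "project": {"name": "demo"},
    "tool": {"mytool": {"orcid": "0000-0000-0000-0000"}},  # fine: tool is reserved for this
    "citation": {"orcid": "0000-0000-0000-0000"},  # not a standardized table
}

print(unexpected_tables(parsed))
```

Anything tool-specific goes under tool, so the same data is acceptable there but not as a new top-level table.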

@pradyunsg

This comment was marked as off-topic.

@pradyunsg
Member

Ah, @CAM-Gerlach responded to your question on https://discuss.python.org/t/adding-extra-fields-in-the-pyproject-toml-authors-maintainers-list/16848/3 as well. I'll mark the preceding discussion as off-topic, since it's more of a question about the spec, rather than related to actually moving them over.

@CAM-Gerlach
Contributor

CAM-Gerlach commented Jun 27, 2022

As an on-topic portion of my comment was hidden as well (I should have initially collapsed the off-topic portion with <details>), I will repeat it here:

As mentioned on #1093 , I hope to have a draft of the above proposed structure up as a PR here by sometime this week, presuming no major objections (though I've given due consideration to such alternatives, it is relatively flexible to adapt if reviewers feel that, e.g. "Declaring a project's build system" should be a subsection of "Build backend interface specification", or that the hooks in the latter should be organized by required/optional rather than build type).

Ah, @CAM-Gerlach responded to your question on discuss.python.org/t/adding-extra-fields-in-the-pyproject-toml-authors-maintainers-list/16848/3 as well.

Well, actually, I answered the Discourse thread OP's question over there :) ; I still suggest @JDLH read my collapsed reply here, as it attempts to clear up some broader confusion about what is in and out of scope for this proposed spec migration, and what he can do if he would like to change the content of the specs themselves, as it appeared he was requesting previously on the other Discourse thread.

@JDLH

JDLH commented Jun 27, 2022

Thank you for the replies. I intended my comment as a test case for the usefulness of the new specification, not as a proposal for a substantive change. Thus I sort of think it is on-topic.

This issue talks about writing a specification document that will explain to app developers about what goes into pyproject.toml. The Discuss thread has an app developer asking about what goes into pyproject.toml. Will the new specification proactively answer that question? Or does it omit the answer due to imperfect wording, but the original PEP has the answer? (In which case, an editorial improvement to the spec is desirable.) Or does the original PEP not answer the question? (In which case, this process has uncovered a gap in the substance, and a PyPA specification update is desirable.)

@CAM-Gerlach
Contributor

This issue talks about writing a specification document that will explain to app developers about what goes into pyproject.toml.

Well, not quite—this issue is about moving existing specification content regarding how build frontends and backends interact to build sdists and wheels, to the PyPA spec section to serve as reference material for packaging tool implementors, which is not really the same thing as a new "explanation"-oriented document aimed at regular Python developers. If you think such a document would be useful (and there certainly seems to be plenty of room for such), I encourage you to open an issue.

However, the particular question being asked was about the content of the [project] table, already covered in its own PEP and PyPA specification, rather than the build-system table defined in PEP 518, and is in fact specific to packaging tool maintainers and thus to the specification itself, as the specific feature requires a custom tool to implement it, which would then need to determine (and inform users in its own documentation) where and how to handle the new metadata in the tool table (unless it would be standardized, in which case it would no longer be a "key not defined by the spec" as in your original question).

Will the new specification proactively answer that question?

While this is a question that is already answered by an existing specification and not actually part of the three core PEPs in question, the structure I propose to implement here should greatly help the important discoverability problem you and the OP highlighted in how it was rather non-obvious to find, underscoring my rationale for it, as it includes a dedicated top-level document for the pyproject.toml specifications named as such, which in turn links to the specifications for each of the top-level tables. Therefore, those interested would be able to easily find the top-level pyproject.toml spec and then quickly navigate to the specification for the top-level table they are interested in. As such, I'm further convinced that is the right approach here, as opposed to burying the pyproject.toml spec in a unitary build spec.

@JDLH

JDLH commented Jun 28, 2022

this issue is about moving existing specification content …to the PyPA spec section to serve as reference material

I think we agree that we are talking about reference material, what Diàtaxis calls a Specification, not an Explanation.

…regarding how build frontends and backends interact to build sdists and wheels … for packaging tool implementors

Which entity is expected to type in values into the fields of the [project] table in pyproject.toml? A build frontend? In my — admittedly, very limited — experiments, my frontend was "python -m build". I did not see it modify the pyproject.toml file. I would have to type in name=Jimomatic into the [project] table myself. If it is expected that app developers edit pyproject.toml directly, to declare their project metadata, then surely app developers are also an audience for this reference material?

Or is it specified that build frontends shall collect project metadata from app developers, and write it in pyproject.toml on behalf of the app developer? In that case, certainly, the app developer does not need to refer to this material, only the build frontend developer. But also, the specification document should state that interaction.

@CAM-Gerlach
Contributor

I think we agree that we are talking about reference material, what Diàtaxis calls a Specification, not an Explanation.

Err, the Diataxis term for "reference" is..."reference"?

But really, I was rather imprecise with the Diataxis allusions; the specification is not primarily user-facing documentation at all, but rather a formal, technical standards document.

Which entity is expected to type in values into the fields of the [project] table in pyproject.toml? If it is expected that app developers edit pyproject.toml directly, to declare their project metadata, then surely app developers are also an audience for this reference material?

Both package authors and packaging tools (though not build backends/frontends directly, as opposed to wrappers, converters and other tools) may generate/update the pyproject.toml; e.g. Setuptools' ini2toml, some packaging tools that generate the pyproject.toml from another file/data source, others that generate it automatically from a template for multiple sub-repos, etc. But I'm a little confused what this has to do with the original user's question, which explicitly asked about a point which was only really relevant to packaging tool maintainers.

Furthermore, this is not user-facing documentation (though it may incidentally serve that role), but rather a formal, normative specification. If there is a strong need which it does not fill in the former area, then a new document should be created instead to do so. If you would like to propose changes to another specification, or new user-facing documentation, I think the most productive course of action for all involved would be to open an appropriate issue/PR.

@JDLH

JDLH commented Jul 1, 2022

…I'm a little confused what this has to do with the original user's question
…the specification is not primarily user-facing documentation at all, but rather a formal, technical standards document.…

You seem to be making a distinction between "user-facing" and "formal, technical standards document", which I think I look at differently. That may lead to us talking past each other a bit.

…Both package authors and packaging tools … may generate/update the pyproject.toml

What this says to me is that package authors are part of the audience for the formal, technical standards document defining pyproject.toml. Sure, developers of packaging tools may be the more important part of that audience. And sure, the specification is not the gentle introduction a package author should start with. But I believe that package authors will absolutely have reason to refer to the specification from time to time.

For instance, if as a package author, I get the idea of "adding other identifiers (like ORCID ) is something that makes sense for our use case", I would expect that I could look up the relevant section in the specification and get an answer about whether that is allowed. Would you consider this a "user-facing" activity? I would call it, author of an artifact finding out what the rules really are for that artifact.

Bringing this back to the core of this Issue 955, I make these comments in an attempt to suggest some requirements that a formal standards document should meet, to do an even better job of bringing PEPs 517, 518, and 660 to the specs section. Someone else gets to decide whether to adopt those requirements for this document.

I think we agree that we are talking about reference material, what Diàtaxis calls a Specification, not an Explanation.

Err, the Diataxis term for "reference" is..."reference"?

Oops, you got me there. Thank you for correcting me.

@CAM-Gerlach
Contributor

But I believe that package authors will absolutely have reason to refer to the specification from time to time.

To be clear, I don't mean to imply that package authors aren't a secondary audience for the document; they just aren't the original intended primary audience, as I understand it. That being said, small editorial changes to clarify unclear language seem fairly likely to be accepted, and in a broader sense changes that aid in understanding and don't harm the specification's primary purpose as such seem, to me, to be in scope too.

On that note, in python/peps#2680 , @pradyunsg , @brettcannon and I discussed changes to improve the spec for both audiences, in particular moving over the examples from PEP 621 to here to aid understanding and reduce the potential for ambiguity and confusion. Perhaps that might be something someone else could help with if I don't get to it first, presuming they are somewhat familiar with technical writing and reST/Sphinx syntax? 😉

For instance, if as a package author, I get the idea of "adding other identifiers (like ORCID ) is something that makes sense for our use case", I would expect that I could look up the relevant section in the specification and get an answer about whether that is allowed. Would you consider this a "user-facing" activity? I would call it, author of an artifact finding out what the rules really are for that artifact.

If it involves injecting something special within the value of the existing author key, then sure, and the spec lays out exactly what is permissible, linking to the appropriate resource (RFC 822) for more information:

The exact meaning is open to interpretation — it may list the original or primary authors, current maintainers, or owners of the package.

The name value MUST be a valid email name (i.e. whatever can be put as a name, before an email, in RFC 822) and not contain commas.

So long as it only involves valid characters as defined for an email name in RFC 822, then it's fine. However, if it involves injecting a whole new key, then it would require, at a minimum, that one or more packaging tools explicitly add and document support for it, for which the packaging tool's normative documentation would be the canonical reference. Of course, that would still not be allowed by PEP 621, and conforming tools would produce a clear error message when attempting this. In the Discourse thread, I'd suggested a clarification to make explicit that any sub-table should also be as-specified in the PEP, without any non-standard keys.
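
A minimal sketch of how a tool might apply those two rules to the authors entries, assuming the stricter reading suggested above (only the name and email keys are accepted). `check_authors` is a hypothetical helper, and it deliberately does not attempt full RFC 822 validation of the name:

```python
# Keys defined for each authors/maintainers entry in the project table.
ALLOWED_AUTHOR_KEYS = {"name", "email"}


def check_authors(authors: list[dict]) -> list[str]:
    """Return human-readable problems found in a list of author entries."""
    problems = []
    for index, entry in enumerate(authors):
        # Assumption: non-standard keys in the sub-table are treated as errors.
        extra = sorted(set(entry) - ALLOWED_AUTHOR_KEYS)
        if extra:
            problems.append(f"authors[{index}]: unexpected keys {extra}")
        # The quoted rule: the name value must not contain commas.
        if "," in entry.get("name", ""):
            problems.append(f"authors[{index}]: name must not contain commas")
    return problems


# Hypothetical entries; the ORCID value is a placeholder.
authors = [
    {"name": "A. Researcher", "email": "a@example.org", "orcid": "0000-0000-0000-0000"},
    {"name": "Researcher, B.", "email": "b@example.org"},
]

for problem in check_authors(authors):
    print(problem)
```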

Bringing this back to the core of this Issue 955, I make these comments in an attempt to suggest some requirements that a formal standards document should meet, to do an even better job of bringing PEPs 517, 518, and 660 to the specs section. Someone else gets to decide whether to adopt those requirements for this document.

My curmudgeon-ing aside, like I mentioned on Discourse, it's always good to have a pair of eyes on something technical one writes that they may be more familiar with than many readers, since it's easy to leave things implicit, assumed or unclear to others. I've seen that from both sides myself, including on PEP 621 itself.

@CAM-Gerlach
Contributor

It's finally done! Opened a PR as #1111

@pradyunsg
Member

pradyunsg commented Jul 25, 2022

I'll consolidate this into #1093, since we're tracking this as part of a broader effort now. Thanks for the discussion so far folks, and please take a look at the PR listed above! :)

@CAM-Gerlach
Contributor

CAM-Gerlach commented Jul 25, 2022

Huh, why now, after all this, when I finally just opened a PR specifically to close this issue? I'm confused...

@pradyunsg
Member

I mean, we can close this via your PR, I don’t mind that either. :)

4 participants