Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0171] Default name of fetchFromGithub FOD to include revision #171

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

jonringer
Copy link
Contributor

@jonringer jonringer commented Mar 20, 2024

Make fetchFromGithub FOD name more meaningful. This avoids stale artifacts and gives more content fidelity when looking at nix store paths.

Rendered: https://github.com/jonringer/rfcs/blob/jringer/fetch-from-github/rfcs/0171-fetch-from-github.md

Items for further refinement:

  • Other version control fetchers
  • Prefix vs suffix of src label

rfcs/0171-fetch-from-github.md Outdated Show resolved Hide resolved
rfcs/0171-fetch-from-github.md Outdated Show resolved Hide resolved
rfcs/0171-fetch-from-github.md Outdated Show resolved Hide resolved
rfcs/0171-fetch-from-github.md Outdated Show resolved Hide resolved
rfcs/0171-fetch-from-github.md Outdated Show resolved Hide resolved
Jonathan Ringer and others added 2 commits March 19, 2024 19:04
@Aleksanaa
Copy link
Member

Just one silly question: why only fetchFromGitHub but not including fetchFromGitLab and others, besides the reason that it is very common?

Although fetchFromGithub is one of many fetchers; it is very common, and generally has a user specify granular source information which makes differentiating between sources easy.

@jonringer
Copy link
Contributor Author

jonringer commented Mar 20, 2024

Just one silly question: why only fetchFromGitHub but not including fetchFromGitLab and others, besides the reason that it is very common?

I did reference this

- Similar treatment to similar fetchFromGitX helpers?

I mention fetchFromGitHub because it's about 20x as common:

$ rg -i fetchFromGitHub | wc -l
33178
$ rg -i fetchFromGitlab | wc -l
1478

I'm not opposed to doing a similar treatment to the other fetchFromX helpers. I think this RFC could be expanded to encompass them all eventually.

To your point, maybe this should be relabeled as "Default name of fetchFromX FOD helpers to include revision"

@eclairevoyant
Copy link
Member

There's already a speculative impl as of 2 weeks ago: NixOS/nixpkgs#294068

Though the format is slightly different ("source-${owner}-${repo}-${rev}", essentially)

@ShamrockLee
Copy link

ShamrockLee commented Mar 23, 2024

Though the format is slightly different ("source-${owner}-${repo}-${rev}", essentially)

I choose the format to make it easier to inspect which source FOD is going wrong when several packages got rebuild at the same time.

Update: Just came across NixOS/nixpkgs#49862, which seems to work on the package names of fetchers.

Comment on lines 69 to 77
- "Interchangeability" with other fetchers is diminished as the derivation name is different
- In practice, fetchFromGitHub is never used in this way. It is generally the only fetcher, so there is never another FOD to dedupilicate.
- Out-of-tree repositories may get hash mismatch errors
- If the cause of the mismatch is staleness, this is good and working as intended
- If the cause is non-determinism, this is frustrating.
- Some derivations assume "source" to be the name of sourceRoot
- This has been mitigated over two years within Nixpkgs
- Out-of-tree code may break if they assume "source" is the name
- Can be mitigated with release notes describing the issue
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are caused by the assumption of src.name == "source" being broken.

However, we have already deprecated such assumption as early as Nixpkgs 21.11. Nixpkgs Manual now requires sourceRoot to start with "${src.name}" instead of "source" when src is constructed with by fetchgit-based fetchers, and tree-wide conversions (NixOS/nixpkgs#245388, NixOS/nixpkgs#247977, NixOS/nixpkgs#248528, NixOS/nixpkgs#248683, NixOS/nixpkgs#294334) have been merged.

fetchFromGitHub and other output-as-a-directory fetchers can still be used interchangeably if we stick to ${src.name} instead of "source".

In my opinion, we do not provide bug-level compatibility to packages failing to follow already-stablized specifications, in-tree or out-of-tree.

@ShamrockLee
Copy link

maybe this should be relabeled as "Default name of fetchFromX FOD helpers to include revision"

IMO, "fetchers for version-controlled repositories" would be a suitable target. This includes fetchers based on specific version control systems (e.g. fetchcvs, fetchsvn, fetchgit, fetchhg, fetchbzr, etc.) and fetchers based on specific service providers (e.g. fetchFrom*).

@Ericson2314
Copy link
Member

Hehe per #133 (comment), my dream of single hash git fetchers (avoiding the need to put things in the name like this) is now a good bit closer!

@ShamrockLee
Copy link

Hehe per #133 (comment), my dream of single hash git fetchers (avoiding the need to put things in the name like this) is now a good bit closer!

Congratulations!

We still need to fix those fetchers at Nixpkgs level, though, since Nixpkgs often takes years to adopt new Nix features.

Comment on lines +95 to +101
- Full commit hashes could be truncated. This sacrifices a bit of simplicity for better looking derivation names:
```
let
version = builtins.replaceStrings [ "refs/tags/" ] [ "" ] rev;
# Git commit hashes are 40 characters long, assume that very long versions are version-control labels
ref = if (builtins.stringLength rev) > 15 then builtins.substring 0 8 version else version;
in lib.sanitizeDerivationName "${repo}-${ref}-src";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't recommend this. Short hashes are not guaranteed to be stable for long term storage.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

40 characters are not really that long for a machine-generated derivation.

People would most likely see the store hash and (hopefully) part of the name when it flashes through their terminal. During debugging, a path like that would occupy at most a bit more than one line inside the terminal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current "source" isn't that stable either. 8 characters is still a fair amount of entropy, and likely to be different enough for most repositories.

Copy link
Contributor Author

@jonringer jonringer Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also remember, that it doesn't have to be "unique across all time". It just needs to be different than what was there before, so that the combination of name + hash are different.

@infinisil infinisil changed the title Default name of fetchFromGithub FOD to include revision [RFC 0171] Default name of fetchFromGithub FOD to include revision Mar 28, 2024
@MMesch MMesch added the status: open for nominations Open for shepherding team nominations label Apr 2, 2024
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-04-02/42643/1

@ShamrockLee
Copy link

ShamrockLee commented Apr 3, 2024

I'd like to nominate myself as a shepherd. I previously authored a related implementation, NixOS/nixpkgs#294068, which could inform the discussion around this RFC's design.

I'm excited about this RFC and look forward to working with the community to shepherd it through the process.

Co-authored-by: Yueh-Shun Li <shamrocklee@posteo.net>
@jonringer
Copy link
Contributor Author

IMO, "fetchers for version-controlled repositories" would be a suitable target. This includes fetchers based on specific version control systems (e.g. fetchcvs, fetchsvn, fetchgit, fetchhg, fetchbzr, etc.) and fetchers based on specific service providers (e.g. fetchFrom*).

If we have a shepards meeting, we can refine the scope.

@oxij
Copy link
Member

oxij commented Apr 4, 2024

Yes, NixOS/nixpkgs#49862 is very related, in its latest reincarnation it allows you to keep both *-source names on Hydra and generate pretty *-<name>-<version>-<optional fetcher>-source names with config.nameSourcesPrettily option set. See https://github.com/NixOS/nixpkgs/pull/49862/files#diff-1977c7748af8b43b92093d2383e23e88c4df10a702578fbe885675a0956a2f1f

@ShamrockLee
Copy link

To code the fetcher name into the name attribute, the default name could also be ${repo}-${rev}-github-source.

# Detailed design
[design]: #detailed-design

Change the default name of fetchFromGithub fetcher from `"source"` to `lib.sanitizeDerivationName "${repo}-${rev}-src"`.
Copy link

@sdht0 sdht0 Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two suggestions:

  • How about using pname instead of repo as a sort of middle ground between stability and informative names. Of course then changing the repo url will not automaticaly verify if the hash actually matches, so a tradeoff.
  • Also how about starting with src- instead of at the end? When listing the nix store, all fetched inputs can then be sorted together.

The second suggestion is actually a part of a wish I have to prepend src- to all build inputs (tars, patch files, builders, etc), pkg- to all packages, and nixos- to the generated nixos configs, so that it is easier to tell at a glance why a particular store path is being installed, and also would be useful for tools like nvd when listing diffs. Yeah this would a much broader proposal, but one can dream someday it will happen. For now, we can start with just the FOD fetchers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using pname instead of repo as a sort of middle ground between stability and informative names. Of course then changing the repo url will not automaticaly verify if the hash actually matches, so a tradeoff.

pname is a stdenv.mkDerivation convention, this would require people to pass it separately. The goal here also is to avoid hash mismatch, so I don't think is this is enough to solve the problem this RFC is trying to remedy.

Also how about starting with src- instead of at the end? When listing the nix store, all fetched inputs can then be sorted together.

I wouldn't be opposed to it, but I'm used to <pname>-<version>-<variant> naming, source being a variant in this case.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second suggestion is actually a part of a wish I have to prepend src- to all build inputs (tars, patch files, builders, etc), pkg- to all packages, and nixos- to the generated nixos configs, so that it is easier to tell at a glance why a particular store path is being installed, and also would be useful for tools like nvd when listing diffs. Yeah this would a much broader proposal, but one can dream someday it will happen. For now, we can start with just the FOD fetchers.

It sounds like another Nix RFC.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm used to <pname>-<version>-<variant> naming, source being a variant in this case.

It's a bit strange to consider package source a "variant" of the package. Nevertheless, such name-suffixing pattern exists in Nixpkgs -- the vendor FOD assign to goModules attribute of buildGoModule has its name set as "${pname}-${version}-go-modules", and sometimes people manually set the name of in-attribute FODs this way.

On the other hand, language-specific packages (e.g. Python packages, Ocaml packages, etc.) often prefix the name with the name and version of the compiler/interpreter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not specifically tied to the suffix approach.. I'll add a comment to the initial comment

Co-authored-by: Pieter Gerets <123326988+gerpiet@users.noreply.github.com>
@ShamrockLee
Copy link

ShamrockLee commented Apr 7, 2024

People who use the result FOD produced by fetchFromGitHub don't even need to know about what name is. They'll just use src.name to set sourceRoot and use sourceRoot in their build commands.

As an enforcible specification, maybe we don't even need to specify what name should default to in the RFC level, but require the default to contain either version or source revision.

(Whether/how long the revision could be truncated are still in this scope, since it affects the probabity of colisions.)

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfc-171-default-name-of-fetchfromx-to-include-version-information/43001/1

@lheckemann
Copy link
Member

This RFC has not acquired enough shepherds. This typically shows lack of interest from the community. In order to progress a full shepherd team is required. Consider trying to raise interest by posting in Discourse, talking in Matrix or reaching out to people that you know.

If not enough shepherds can be found in the next month we will close this RFC until we can find enough interested participants. The PR can be reopened at any time if more shepherd nominations are made.

See more info on the Nix RFC process here

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-04-16/43512/1

@philiptaron
Copy link

I'll toss my name in for an RFC shepherd, as I believe in the utility of this proposal.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfc-171-default-name-of-fetchfromx-to-include-version-information/43001/2

@jonringer
Copy link
Contributor Author

@oxij I would like to nominate you as a shepherd, looks like you've thought about this longer than I've been involved in nixpkgs.

I like your addition of a lib function to make derivation names more consistent in NixOS/nixpkgs#49862

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: open for nominations Open for shepherding team nominations
Projects
None yet