Improve the description of how runtime dependencies are found #239

Open
wants to merge 3 commits into
base: master

Conversation

noamraph
Contributor

I find the question of how Nix decides on dependencies quite mysterious. I wasn't sure what exactly was meant by the current description in chapter 9. So I did some searching and some experimentation, and arrived at the algorithm that I wrote in this PR.

What do you think?

I would really like it if someone verifies that I'm right. I couldn't find any authoritative reference on the exact algorithm, and a lot of references to this pill, so what will be written here will probably be used as a reference by others.

@noamraph
Contributor Author

BTW, I cheated a bit - since I don't have the original derivation mentioned in the text, I gave a fake hash for the output path. I think that this is the least confusing method, and I don't see how the cheat could harm anyone, so I prefer not to bother the reader with the fact that I cheated.

@jtojnar
Contributor

jtojnar commented Apr 10, 2024

The scanning happens here:

https://github.com/NixOS/nix/blob/a268c0de7192188c7233bf83a4635198c360e270/src/libstore/build/local-derivation-goal.cc#L2320

Relevant inputs:

https://github.com/NixOS/nix/blob/a268c0de7192188c7233bf83a4635198c360e270/src/libstore/build/local-derivation-goal.cc#L2220-L2227

The algorithm itself is in this file:

https://github.com/NixOS/nix/blob/3fd8dfec4d0747e8f1129dc7f43c1f89fe3ba432/src/libstore/path-references.cc#L19-L70


What you describe is about right, except that a derivation can have multiple output paths. But then again, the original article only mentions `out` too.
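To make the scanning step concrete, here is a minimal Python sketch of the idea. It is not the Nix implementation (that is C++ and works on a stream); it only illustrates searching the build output for the hash parts of candidate store paths. It assumes store paths of the form `/nix/store/<hash>-<name>` with a 32-character hash part. The names `hash_part` and `scan_references` are hypothetical, and the store paths are made up.

```python
from typing import Iterable, Set


def hash_part(store_path: str) -> str:
    """Return the hash part of a store path (assumed: first 32 chars of the base name)."""
    base_name = store_path.rsplit("/", 1)[-1]
    return base_name[:32]


def scan_references(output_bytes: bytes, candidates: Iterable[str]) -> Set[str]:
    """Return the candidate store paths whose hash part occurs in the output bytes."""
    return {p for p in candidates if hash_part(p).encode() in output_bytes}


# Made-up store paths, for illustration only.
bash = "/nix/store/" + "a" * 32 + "-bash-5.2"
gcc = "/nix/store/" + "b" * 32 + "-gcc-13.2"

# A fake build output: a script whose shebang mentions bash but not gcc.
script = ("#!" + bash + "/bin/bash\necho hello\n").encode()

print(scan_references(script, [bash, gcc]))
```

Only the bash path is reported here, because its hash part is the only one that appears in the output bytes.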

@noamraph
Contributor Author

Thanks @jtojnar! Perhaps you can explain to me what this means, from local-derivation-goal.cc?

> The paths that can be referenced are the input closures, the output paths, and any paths that have been built via recursive Nix calls.

  • Are "input closures" the build dependencies? Does it include dependencies of dependencies?
  • Why should we search for output paths? Or perhaps those are the dependencies, and if so, what is the input closure?
  • What paths are built via recursive Nix calls?

If it's too complicated, then perhaps it doesn't matter, and we should just add a sentence saying that what is described is an approximation, and that the exact behavior is defined by the source code, with a link to the source.

@jtojnar
Contributor

jtojnar commented Apr 11, 2024

> Are "input closures" the build dependencies? Does it include dependencies of dependencies?

Yes, closure refers to the transitive closure of the dependency relation. In this context, it means the result of applying the relation to the set of input derivations.
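As an illustration of what "transitive closure" means here, the following Python sketch walks a reference graph from a set of starting paths. The graph, the short path names, and the function name `closure` are all hypothetical; this is only the generic graph computation, not Nix's code.

```python
from typing import Dict, Iterable, List, Set


def closure(roots: Iterable[str], references: Dict[str, List[str]]) -> Set[str]:
    """Return the roots plus everything reachable from them via `references`."""
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in seen:
            continue  # already visited; also guards against cycles
        seen.add(path)
        stack.extend(references.get(path, []))
    return seen


# Made-up reference graph: each path maps to the paths it refers to.
references = {
    "gcc": ["glibc", "binutils"],
    "binutils": ["glibc"],
    "hello-src": [],
    "glibc": [],
}

# The closure of the direct inputs also includes dependencies of dependencies.
print(sorted(closure(["gcc", "hello-src"], references)))
```

The result contains `binutils` and `glibc` even though only `gcc` and `hello-src` were given as starting points.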

The property is defined here:

https://github.com/NixOS/nix/blob/3fd8dfec4d0747e8f1129dc7f43c1f89fe3ba432/src/libstore/build/derivation-goal.hh#L152-L156

and populated here:

https://github.com/NixOS/nix/blob/3fd8dfec4d0747e8f1129dc7f43c1f89fe3ba432/src/libstore/build/derivation-goal.cc#L600-L649

also see the populating function:

https://github.com/NixOS/nix/blob/5b9cb8b3722b85191ee8cce8f0993170e0fc234c/src/libstore/store-api.hh#L644-L658

> Why should we search for output paths? Or perhaps those are the dependencies, and if so, what is the input closure?

As you might have noticed in #237, one kind of dependency (input) is a derivation. And each derivation realizes into at least one output (named `out` by default). So we scan for outputs of the built derivation’s transitive input derivations.

We need to scan for outputs to be able to discard inputs that are not necessary at run time (e.g. to allow garbage collecting them).

> What paths are built via recursive Nix calls?

Recursive Nix is an experimental feature that allows running nix-build as part of the build of a derivation.

@noamraph
Contributor Author

Thanks @jtojnar for the explanation! I updated the description so it now seems to me quite accurate according to what I understand. I also added one of your links to the source code, since I think it's useful to have a link to the authoritative definition.

WDYT?

Comment on lines +57 to +58

(For completeness: some derivations have multiple output paths. In that case, Nix will search for the hashes of all the referenced outputs. Also, Nix will search for the hashes of source dependencies, such as our `build.sh` file. The authoritative definition is the [source code](https://github.com/NixOS/nix/blob/a268c0de7192188c7233bf83a4635198c360e270/src/libstore/build/local-derivation-goal.cc#L2220-L2227).)
Member


Suggested change
(For completeness: some derivations have multiple output paths. In that case, Nix will search for the hashes of all the referenced outputs. Also, Nix will search for the hashes of source dependencies, such as our `build.sh` file. The authoritative definition is the [source code](https://github.com/NixOS/nix/blob/a268c0de7192188c7233bf83a4635198c360e270/src/libstore/build/local-derivation-goal.cc#L2220-L2227).)

let's skip this for now I think but do the rest:

Basically, I don't think the inputSrcs vs inputDrvs distinction is very useful for the user. From the user's perspective I think it is better to think that the inputs are just the closure of some set of store objects, and those store objects are specified either directly by store path, or by (drv path, output name) pair.
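The "two ways to specify an input" view can be sketched as a small Python data model. Everything here is a hypothetical illustration (the class names `DirectPath` and `DrvOutput`, the `resolve` helper, and the made-up paths); it is not how Nix names or stores these things.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class DirectPath:
    """An input given directly by its store path (e.g. a source file)."""
    path: str


@dataclass(frozen=True)
class DrvOutput:
    """An input given as a (drv path, output name) pair."""
    drv_path: str
    output: str


Input = Union[DirectPath, DrvOutput]


def resolve(inp: Input, outputs: Dict[Tuple[str, str], str]) -> str:
    """Map either kind of input to the store path of the object it denotes."""
    if isinstance(inp, DirectPath):
        return inp.path
    return outputs[(inp.drv_path, inp.output)]


# Made-up mapping from (drv path, output name) to the realized output path.
outputs = {("/nix/store/xxx-gcc.drv", "out"): "/nix/store/yyy-gcc"}

inputs = [
    DirectPath("/nix/store/zzz-build.sh"),
    DrvOutput("/nix/store/xxx-gcc.drv", "out"),
]
print([resolve(i, outputs) for i in inputs])
```

After resolution, both kinds of input are just store paths, which is why the distinction matters little from the user's perspective.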

Also, I'd rather not link to source code, as stuff gets moved around fairly frequently. (Yes, there is a commit hash in there, but old code can also be misleading.) Hopefully this stuff gets some proper reference docs soon, and then the Nix Pills can link to it.

Contributor Author


I personally prefer to keep this paragraph. If someone isn't interested in the details, it's very easy to skip, as it's in parentheses and starts with "for completeness". For me, though, it is reassuring to understand the details of this "magic". I think that specifically mentioning source inputs in addition to derivation inputs has value, since the description about taking the output of a derivation doesn't apply to those inputs.

Regarding the link to the source code, I agree it's not ideal, but until there is a proper reference, I think that it's better than nothing. When there is a proper reference, we should obviously link to it instead of the source. Do you think it will be better if we make it clearer that you should check the current sources and not what's in the actual link? For example, we can have:

The authoritative definition is the source code; you can start from here and check what's changed since the time of writing.

Also, reading it again, if we decide to leave this paragraph, I think I would make the description clearer. Instead of:

Also, Nix will search for the hashes of source dependencies, such as our build.sh file.

Write:

Also, in addition to dependencies which are derivations, there are source dependencies, such as our build.sh file. Nix will also search for the hashes of those, and if found, will add them as runtime dependencies.


3 participants