Add buildWorkspace #180

Open
sellout opened this issue Dec 2, 2022 · 4 comments

@sellout

sellout commented Dec 2, 2022

I’m not sure if this is a good idea, but if not, I’d like to hear what the best alternative is (an explicit, separate buildPackage for each lib in the workspace?).

In any case, I want roughly

  outputs = { self, crane, flake-utils, nixpkgs, ... }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        craneLib = crane.lib.${system};
        pkgs = import nixpkgs {inherit system;};
      in
    {
      packages =
        {default = self.packages.${system}.oneOfTheOnesInTheWorkspace;}
        // craneLib.buildWorkspace {
          src = craneLib.cleanCargoSource ./.;
          doCheck = true;
          buildInputs = [pkgs.libiconv];
          nativeBuildInputs = [];
        };
    });

where buildWorkspace produces an attrset of packages, one per package in the workspace. And the attrs passed to buildWorkspace are propagated to the individual packages. Although maybe it makes sense to put propagated attrs in a single attr, so other buildWorkspace attrs (like, maybe an exclude list or something) can be provided without conflicting or being propagated.
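
A minimal sketch of how such a function could be layered on top of buildPackage, assuming the workspace lists its members as literal directories (no globs) and each member has a `[package]` section. The names `buildWorkspace`, `root`, `exclude`, and `commonArgs` are hypothetical and not part of crane; `commonArgs` is the single attr holding everything that gets propagated to the individual packages.

```nix
# Hypothetical helper, NOT part of crane: one buildPackage per workspace member.
{ craneLib, lib }:
{ root                                  # path to the workspace, e.g. ./.
, src ? craneLib.cleanCargoSource root
, exclude ? [ ]                         # package names to leave out
, commonArgs ? { }                      # attrs propagated to every package
}:
let
  readToml = path: builtins.fromTOML (builtins.readFile path);

  # Assumption: workspace.members contains plain directories, not globs.
  members = (readToml (root + "/Cargo.toml")).workspace.members;
  nameOf = dir: (readToml (root + "/${dir}/Cargo.toml")).package.name;
  names = lib.subtractLists exclude (map nameOf members);

  # Build third-party dependencies once and share them between packages.
  cargoArtifacts = craneLib.buildDepsOnly (commonArgs // { inherit src; });
in
lib.genAttrs names (name:
  craneLib.buildPackage (commonArgs // {
    inherit src cargoArtifacts;
    pname = name;
    # Restrict cargo to a single workspace member.
    cargoExtraArgs = "--locked -p ${name}";
  }))
```

With this shape, the example above would become `buildWorkspace { root = ./.; commonArgs = { doCheck = true; buildInputs = [pkgs.libiconv]; }; }`. Note that every package still sees the whole workspace source, so a change to any crate rebuilds all of them.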

I’m still on the steep part of the learning curve for both Nix and Rust, so I’m as interested in arguments against this as I am in seeing it done.

@ipetkov
Owner

ipetkov commented Dec 6, 2022

Hi @sellout, thanks for the report! Let me ask a bit more concretely: what potential advantages would you be interested in seeing from buildWorkspace which would be impossible with buildPackage? (Keeping in mind that buildPackage can easily build all binaries in the workspace 😉)


From the perspective of organizing derivations and their outputs I could see the advantage of being able to select a specific "output" of the workspace (I know crate2nix supports this). That said, I see a couple of reasons why I wouldn't consider this direction (alone) a motivator:

  • Implementation costs: this would require even deeper analysis of the project structure to figure out what the outputs are (I don't know of any cargo-related tools which allow for easy/arbitrary queries of the project structure). With build scripts I have a feeling it becomes an undecidable problem. Even if figuring out all the outputs can be solved (elegantly?), there is still the issue of knowing which targets to actually build. Some targets may depend on others, and having separate outputs is kind of moot if you have to build the whole workspace anyway, as the granularity doesn't afford you much of anything (besides maybe a derivation with one /bin/* output).
  • At the end of the day, each project maintainer understands their project structure best. If they want to build a subset of the workspace, I would recommend setting the right cargo flags to avoid unnecessary work. If they wish to build the entire workspace but split the outputs into separate derivations (for organization's sake), it is pretty trivial to do with some runCommand invocations.
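
The runCommand approach from the last bullet could look roughly like this. It is only a sketch: the binary names `foo` and `bar` are hypothetical, and it assumes the workspace build installs its binaries under `bin/`.

```nix
{ pkgs, craneLib }:
let
  # Build the entire workspace once.
  workspace = craneLib.buildPackage {
    src = craneLib.cleanCargoSource ./.;
  };

  # Copy a single binary out of the workspace build into its own derivation.
  onlyBin = name: pkgs.runCommand name { } ''
    mkdir -p $out/bin
    cp ${workspace}/bin/${name} $out/bin/
  '';
in
{
  inherit workspace;
  foo = onlyBin "foo"; # hypothetical binary name
  bar = onlyBin "bar"; # hypothetical binary name
}
```

Each split derivation still depends on the full workspace build, so this organizes the outputs without saving any build work.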

Where I do see a stronger motivator for buildWorkspace is more efficient incremental caching within the workspace itself. For example, consider a workspace with a large number of crates where the "leaf" crates receive the most changes and the "root" crates receive far fewer. It would be really nice to get incremental caching such that we don't have to rebuild the entire workspace when only the "leaf" crates change.

I've had some ideas for implementing something like this. I think the basic approach would be using something like cargo-guppy to get a topological sort of the workspace's crates. We can then build a chain of buildPackage derivations each depending on the cargoArtifacts of the previous derivation, as well as filtering the source files to only contain the current crate and its dependencies' sources to avoid invalidation.
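
A rough sketch of the chaining part of that idea, under several assumptions: the `order` list is written by hand (roots first) rather than computed by cargo-guppy, the crate names are hypothetical, and the per-crate source filtering is left out entirely, so this version does not yet avoid invalidation when an unrelated crate changes.

```nix
{ craneLib, lib, src }:
let
  # Hypothetical crate names in topological order, dependencies first.
  order = [ "core" "util" "app" ];

  # Third-party dependencies form the start of the chain.
  deps = craneLib.buildDepsOnly { inherit src; };

  # Build one crate on top of the artifacts of the previous step.
  step = prev: name: craneLib.buildPackage {
    inherit src;
    pname = name;
    cargoArtifacts = prev;
    cargoExtraArgs = "--locked -p ${name}";
    # Keep the target directory so the next step can reuse it.
    doInstallCargoArtifacts = true;
  };

  chain = lib.foldl'
    (acc: name:
      let drv = step acc.last name;
      in { last = drv; packages = acc.packages // { ${name} = drv; }; })
    { last = deps; packages = { }; }
    order;
in
chain.packages
```

Adding the filtered `src` per step is the hard part: cargo expects every workspace member named in the root manifest to be present, so the filter would have to keep at least the manifests of the crates it drops, or rewrite the root manifest.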

@sellout
Author

sellout commented Dec 15, 2022

So, I could be off the mark here, but here’s what I’m thinking. First, I mostly build libraries, so the flake is largely for developer/CI consistency and efficiency. So yeah, my use case is basically what you state in the second part. I do something similar for other projects. Here is a Haskell example, with less automation than Crane provides: https://github.com/con-kitty/concat/blob/506a9461079fa407df4b81f4d1bb00f84671e7f0/flake.nix

Line 30 processes the “cabal.project” file, which has a packages section somewhat similar to a [workspace] stanza. It then uses that for a few things:

  • line 36: produce an overlay with all the packages in the right place;
  • line 54: produce a set of packages (cartesian product of compiler versions and individual packages, plus one with the full set of packages loaded withPackages style); and
  • line 58: produce a set of devShells (one for each compiler version we care about) with the dependencies for all the packages in the “workspace”.

An extension of the incremental build/caching is that the separate packages are great to have for garnix to get fine-grained CI failure reporting.

About the first part – you’re right, that’s a tough and frustrating problem. Our Haskell solution is hacky, and I was thinking Crane’s could be at least less hacky (but I’m only guessing, I don’t know cargo tooling at all). It sounds like cargo-guppy could do one better than a topological sort, and give you the exact set of direct dependencies that intersect with workspace.members to depend on (and depending on only the -deps package if the intersection is empty).

In Haskell we just depend on the overlays.default, which contains all of the packages from the “workspace”, so each entry in packages doesn’t know its explicit dependencies, but can find any of them via the overlay. Less precise, but easy.

@dpc
Contributor

dpc commented Dec 15, 2022

This might be related and/or helpful. In the project where I use crane heavily, we have both package-based outputs and workspace-based outputs. The workspace ones are used for CI, where we quickly and robustly try to build/test everything in aggregate. The package ones are useful for OCI containers, or when building for architectures (like WASM) where not all of the code can even compile. Notably, in package builds we also list all the directories that are actually being used, to help caching and avoid rebuilds (at the cost of having to maintain these lists manually).
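
A minimal sketch of what such a hand-maintained directory list could look like, not taken from the project mentioned above. The helper `srcFor`, the directories `crates/common` and `crates/server`, and the package name `server` are all hypothetical.

```nix
{ craneLib, lib }:
let
  root = ./.;

  # Keep the root manifest, the lock file, and only the listed directories.
  srcFor = dirs: lib.cleanSourceWith {
    src = root;
    filter = path: _type:
      let rel = lib.removePrefix (toString root + "/") (toString path);
      in builtins.elem rel [ "Cargo.toml" "Cargo.lock" ]
        || lib.any
          (d:
            rel == d
            || lib.hasPrefix "${d}/" rel   # something inside a listed dir
            || lib.hasPrefix "${rel}/" d)  # a parent of a listed dir
          dirs;
  };
in
{
  server = craneLib.buildPackage {
    # Only changes under these directories invalidate this package.
    src = srcFor [ "crates/common" "crates/server" ];
    pname = "server";
    cargoExtraArgs = "--locked -p server";
  };
}
```

One caveat: cargo refuses to load a workspace when a directory named in `workspace.members` is missing, so the list has to cover every member the root manifest names, or the manifest has to be adjusted to match.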

@IreneKnapp

hi! I came here to say that I found this explanation for why the feature doesn't exist (yet) to be useful and think it should make its way into the documentation. I think that officially documenting the reasoning here would be worthwhile even if the feature is ultimately added, since it would clarify the relationship between the two functions.
