Rename nixosConfigurations to configurations.<system> #10291

Open
wants to merge 1 commit into
base: master

Conversation

tie
Member

@tie tie commented Mar 22, 2024

Motivation

Deprecates nixosConfigurations in favor of the configurations.<system> output (where system is semantically equivalent to builtins.currentSystem, that is, the system used for realisation, as with packages and other flake outputs that contain derivations).

See also #6257

Context

This change makes it possible to define cross-compiled configurations. We add a configurations attribute because changing nixosConfigurations.<name> to nixosConfigurations.<system>.<name> is harder to implement in a non-breaking manner, and because configurations (as implemented) is not limited to NixOS configurations (it uses the class attribute).
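The difference between the two schemas can be sketched as a flake. This is only an illustration, not code from the PR: the machine name, ./configuration.nix, and the chosen platforms are hypothetical, and configurations is the output proposed here, which released Nix versions do not treat specially.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }: {
    # Current schema: one attribute per machine, no <system> level.
    # Assumes ./configuration.nix (hypothetical) sets nixpkgs.hostPlatform.
    nixosConfigurations.machine = nixpkgs.lib.nixosSystem {
      modules = [ ./configuration.nix ];
    };

    # Proposed schema: the first level names the system the configuration
    # is built on. Here an x86_64-linux machine is built from aarch64-linux.
    configurations.aarch64-linux.machine = nixpkgs.lib.nixosSystem {
      modules = [
        ./configuration.nix
        {
          nixpkgs.buildPlatform = "aarch64-linux";
          nixpkgs.hostPlatform = "x86_64-linux";
        }
      ];
    };
  };
}
```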

Limitations:

  • Using this reliably and reproducibly requires content-addressed derivations (that are currently available as an experimental feature).
  • Even with content-addressed derivations, a lot of work has to be done in NixOS/Nixpkgs so that build ≠ host does not change the output.

While not directly related, also keep in mind that NixOS configurations currently support only a single host platform. That is, the resulting system contains executables and bootloader only for a single CPU architecture. E.g. it is not possible to install systemd-boot with per-architecture boot entries to boot the configuration for the specific CPU architecture.

Nixpkgs branch (no PR yet): NixOS/nixpkgs@master...tie:nixpkgs:nixos-rebuild-system-attr

Nixpkgs-side support above is implemented by making uniform flake URI handling across NixOS tools (nixos-rebuild, nixos-install, nixos-container) using nixos-config-flake-uri tool. The tool also handles multiple output formats and escaping (e.g. for nixos-rebuild repl motd expression).

In particular, all NixOS flake-aware tools will support absolute attribute paths, which provides a transition path from the old schema, for example:

nixos-rebuild --flake oldFlake#.nixosConfigurations.machine

That said, we can also add a flag to nixos-config-flake-uri to evaluate the flake and check attribute path from URI fragment against nixosConfigurations (nix eval --apply to check attribute existence).

Regarding scoping configurations under system attribute:
I don’t think that’s been discussed in a separate issue, although there is nix-community/home-manager#2161, #6257 (comment) and a quote from #8665:

However, in some cases we do want to ignore the CLI context. For instance, nixosConfigurations is not behind a system attribute set, because a machine configuration should be built in one way only. Reproducibly.

While I agree that configuration must be built reproducibly, that does not mean that it has to be built one way only. If it really can’t be cross-compiled to an identical package, then the flake shouldn’t define other system attributes (e.g. packages has similar semantics), and/or use absolute attribute path to be confident that a particular configuration is built. Despite some bugs (and assuming content-addressed derivations), it should be possible to cross-compile an entire configuration that is bit-wise identical to the native build.

Note that #6257 suggests a configurations.<class> scheme. While this makes querying configurations of a particular class arguably more efficient, I don’t see much practical value. Tools like nixos-rebuild don’t list configurations (and there is little reason to do so) and nix flake show (as implemented in this PR) prints the class value (if it exists). Finally, and perhaps most importantly, configurations.<system> is consistent with packages, checks, apps, etc.

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@tie tie requested a review from edolstra as a code owner March 22, 2024 13:59
@github-actions github-actions bot added the documentation, new-cli, and with-tests labels Mar 22, 2024
@tie tie force-pushed the nixos-configurations-system-attr branch from ea06cef to 4df697d Compare March 22, 2024 13:59
This change makes it possible to define cross-compiled configurations.
We add `configurations` attribute because changing
`nixosConfigurations.<name>` to `nixosConfigurations.<system>.<name>` is
a breaking change, i.e. we cannot efficiently differentiate between two
schemas.

Limitations:
- Using this reliably requires content-addressed derivations (that are
  currently available as an experimental feature).
- Even with content-addressed derivations, a lot of work has to be done
  in NixOS/Nixpkgs so that `build ≠ host` does not change the output.

While not directly related, also keep in mind that NixOS configurations
currently support only a single host platform. That is, the resulting
system contains executables and bootloader only for a single CPU
architecture. E.g. it is not possible to install systemd-boot with
[per-architecture boot entries] to boot the configuration for the
specific CPU architecture.

[per-architecture boot entries]: https://uapi-group.org/specifications/specs/boot_loader_specification/#boot-loader-entries
@tie tie force-pushed the nixos-configurations-system-attr branch from 4df697d to bf448de Compare March 22, 2024 13:59
@edolstra
Member

The attribute name should include "nixos" to distinguish it from other types of configurations (e.g. home-manager configurations). Maybe nixosConfigurationsFor.<system>?

However, I'm not really convinced that this is needed. Unlike (say) packages, a NixOS configuration is always for a particular machine and therefore a particular system type, and it's not very likely that users will want to query what configurations are available for a system type.

This change makes it possible to define cross-compiled configurations.

I'm not sure how this helps with cross-compilation. It's not even clear whether "system" denotes the target or build system. And we should probably come up with a general scheme for cross-compilation first (e.g. to support packages) rather than just for NixOS configurations.

Using this reliably and reproducibly requires content-addressed derivations

Other than CA derivations reducing binary cache storage if the cross-build produces the exact same result as the non-cross-build, I don't see how CA derivations are needed?

@tie
Member Author

tie commented Mar 22, 2024

Other than CA derivations reducing binary cache storage if the cross-build produces the exact same result as the non-cross-build, I don't see how CA derivations are needed?

To keep every bit of the configuration identical down to the store paths. For example, Go compiler binary input would be different on macOS and Linux, but the output of the compiler is the same for a given platform (GOOS/GOARCH/etc). With CA derivations, the following derivation output should result in identical store paths, even if bash input from stdenv differs between systems:

runCommand "hello" { } ''
  echo hello >"$out"
''
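For reference, a floating content-addressed variant of the snippet above could look like the sketch below. It assumes the ca-derivations experimental feature is enabled and that pkgs is an ordinary Nixpkgs instance; the three attributes follow the Nix manual's description of content-addressed derivations and are not taken from this PR.

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.runCommand "hello"
  {
    # Address the output by its content rather than by its inputs, so a
    # different bash or stdenv on the build side need not change the path.
    __contentAddressed = true;
    outputHashMode = "recursive";
    outputHashAlgo = "sha256";
  } ''
  echo hello >"$out"
''
```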

The attribute name should include "nixos" to distinguish it from other types of configurations (e.g. home-manager configurations). Maybe nixosConfigurationsFor.<system>?

The NixOS module system already has a class attribute that should cover this case. Since the main access pattern is not listing configurations, but evaluating an attribute path for a given configuration flake URI, I think it’s enough to check the _type and class attributes. E.g. see https://github.com/NixOS/nixpkgs/blob/123de15b1a7546e6779dc2bbdaf0c4fc8cd860bf/pkgs/os-specific/linux/nixos-rebuild/nixos-rebuild.sh#L594-L595

However, I'm not really convinced that this is needed. Unlike (say) packages, a NixOS configuration is always for a particular machine and therefore a particular system type, and it's not very likely that users will want to query what configurations are available for a system type.

This is not for a particular system type, this is built from a system (as in localSystem, not crossSystem).

I'm not sure how this helps with cross-compilation. It's not even clear whether "system" denotes the target or build system.

The system denotes the build system, just like with all other system-dependent flake attributes.

And we should probably come up with a general scheme for cross-compilation first (e.g. to support packages) rather than just for NixOS configurations.

I’m not really sure how this is supposed to look. I’ve never really had a problem with cross-compiling packages from flakes. We can already define packages.${localSystem}.myPackage-${supportedSystem} for a list of supportedSystems that myPackage supports to be built for on a given localSystem system (optionally using shorthands like gnu64 for supportedSystem). That allows users to easily build the package from command-line. For anything more complex, there are Nixpkgs overlays.

For example, a flake defined for the aarch64-darwin and x86_64-linux systems that exposes a hello package, which can’t be cross-compiled from every system the flake supports:

├───apps
│   ├───aarch64-darwin
│   │   └───default: app
│   └───x86_64-linux
│       └───default: app
├───overlays
│   ├───default: Nixpkgs overlay
│   └───hello: Nixpkgs overlay
└───packages
    ├───aarch64-darwin
    │   ├───default: package 'hello-1.0.0'
    │   ├───hello: package 'hello-1.0.0'
    │   ├───hello-apple-m1: package 'hello-1.0.0'
    │   ├───hello-gnu32: package 'hello-1.0.0'
    │   └───hello-gnu64: package 'hello-1.0.0'
    └───x86_64-linux
        ├───default: package 'hello-1.0.0'
        ├───hello: package 'hello-1.0.0'
        ├───hello-gnu32: package 'hello-1.0.0'
        └───hello-gnu64: package 'hello-1.0.0'

You can even have packages.${system} for some system that cannot run the package but can cross-compile for other systems, by omitting the default and hello packages that users expect to be built for local system.
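A flake that produces a layout like the packages part of the tree above might be sketched as follows. This is an assumption-laden illustration: it uses hello from Nixpkgs as a stand-in for the flake's own package, covers only the gnu32 and gnu64 targets, and relies on the Nixpkgs pkgsCross sets named after the lib.systems.examples shorthands.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }:
    let
      lib = nixpkgs.lib;
      # Systems the flake can be built from (the <system> level).
      buildSystems = [ "aarch64-darwin" "x86_64-linux" ];
      # Cross targets, named by Nixpkgs shorthand.
      crossTargets = [ "gnu32" "gnu64" ];
    in
    {
      packages = lib.genAttrs buildSystems (system:
        let
          pkgs = nixpkgs.legacyPackages.${system};
          # hello-gnu32, hello-gnu64: built on `system`, run on the target.
          cross = lib.listToAttrs (map
            (target: lib.nameValuePair "hello-${target}" pkgs.pkgsCross.${target}.hello)
            crossTargets);
        in
        cross // {
          # Native builds that users expect to run on the local system.
          hello = pkgs.hello;
          default = pkgs.hello;
        });
    };
}
```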

A similar structure applies to configurations.<system>, except that a configuration always targets one host system, so there are no suffixes like in the packages example above.

@roberth
Member

roberth commented Mar 22, 2024

This is not for a particular system type, this is built from a system (as in localSystem, not crossSystem).

Local and cross are kinda bad terminology. I'd recommend the GNU and Nixpkgs terminology build and host, which are also available in NixOS as nixpkgs.{build,host}Platform since somewhat recently.

<system> becomes ambiguous

When both are set, localSystem refers to build. Unfortunately <system> usually refers to the host platform, as evidenced by its value: the platform where the CLI runs.

What you're proposing also differs from Home Manager, which also uses the automatic <system> attribute to refer to the host platform. I guess you could say that its goal is portability, and not cross compilation necessarily.

CA does not mean we should make the build platform a parameter

Even if CA works and the expressions are close to perfect, a cross compiled result is weaker than a natively built one, because far fewer tests will run.
That's assuming the builds reach that level. If they're anything short of that, the user should be in control of how the "build platform impurity" turns out, which means letting the user make an explicit choice, not by deciding which machine to evaluate on, but by making them write configuration: "I want foo configuration to be built on bar64", i.e. nixosConfigurations.foo = lib.nixosSystem { modules = [ { nixpkgs.buildPlatform = "bar64"; } more-config ]; };.
Building on a different platform should be a recorded decision, not a transient circumstance, which is exactly what the platform of the deployer's CLI is: one day you might deploy from your x86_64-linux and the next, it might be your colleague on their aarch64-darwin.

@tie
Member Author

tie commented Mar 23, 2024

Local and cross are kinda bad terminology

Agreed, just to be clear:

localSystem, crossSystem
Nixpkgs arguments that describe build and host platforms in a structured manner (or as a string that is elaborated to a structured platform specification). crossSystem defaults to localSystem. In impure evaluation mode, localSystem defaults to builtins.currentSystem string.
nixpkgs.{build,host}Platform
Module options for evaluating Nixpkgs from NixOS configuration. While convenient for simple configurations, it is more efficient to re-use Nixpkgs via nixpkgs.pkgs option, e.g. if Nixpkgs instance already exists for a given localSystem platform (that is, system value from attribute path) that is used for other outputs like packages and checks.
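As a rough sketch of the nixpkgs.pkgs re-use described above: a single Nixpkgs instance is created for a given build system and shared between packages and the configuration. The machine name and ./configuration.nix are hypothetical, and configurations is the output proposed in this PR rather than something Nix currently recognizes.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }:
    let
      # Build system; also the <system> level of the attribute paths below.
      system = "aarch64-linux";

      # One Nixpkgs instance: built on aarch64-linux, running on x86_64-linux.
      pkgs = import nixpkgs {
        localSystem = { inherit system; };
        crossSystem = { config = "x86_64-unknown-linux-gnu"; };
      };
    in
    {
      # Cross-compiled package, suffixed with the Nixpkgs gnu64 shorthand.
      packages.${system}.hello-gnu64 = pkgs.hello;

      # The NixOS evaluation re-uses `pkgs` instead of instantiating Nixpkgs
      # again from nixpkgs.{build,host}Platform.
      configurations.${system}.machine = nixpkgs.lib.nixosSystem {
        modules = [
          ./configuration.nix # hypothetical machine configuration
          { nixpkgs.pkgs = pkgs; }
        ];
      };
    };
}
```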

System is not reflected in schema

I’d argue that configurations.<system> is a recorded decision, and that is currently not reflected in the schema. Deploying such configurations right now is a bit awkward—nixos-rebuild --flake flake#machine-aarch64-linux—and confusing since it looks like the machine is aarch64-linux, but it’s not. It gets worse when there are similar configurations for different hardware (e.g. for build farm with varying hardware configurations but same core modules): flake#machine-aarch64-aarch64-linux, flake#machine-x86_64-aarch64-linux, etc. It would be easier if NixOS supported building configurations that can run on multiple host platforms (see per-architecture boot entries I’ve mentioned above), but I don’t think it does right now and it’d take some time to design and implement properly.

The proposed schema makes this decision explicit and for most use cases the migration path is simply s/nixosConfigurations/configurations.$buildPlatformSystem/.

There is also an unintended but positive side effect when substituters have (full or part of) the configuration cached. With nixosConfigurations, running nixos-rebuild --target-host usually fails in unexpected ways, for example, it tries to re-exec itself, and it is a bash script with shebang, and bash has some weird logic that causes it to interpret the script instead of just failing on executable format error.

# From aarch64-linux to x86_64-linux, cached configuration, no --fast flag
nixos-rebuild test --verbose --target-host machine --flake .#machine
$ nix --extra-experimental-features nix-command flakes build --out-link /tmp/nixos-rebuild.ZOx7m1/nixos-rebuild .#nixosConfigurations."machine".config.system.build.nixos-rebuild --verbose
$ exec /nix/store/zbra25fvpbkakzp28p0adpgc9952hj78-nixos-rebuild/bin/nixos-rebuild test --verbose --target-host machine --flake .#machine
/nix/store/zbra25fvpbkakzp28p0adpgc9952hj78-nixos-rebuild/bin/nixos-rebuild: line 382: /nix/store/kjpanj8sfda335sca7rswrywnma1m40c-coreutils-9.3/bin/mktemp: cannot execute binary file: Exec format error
Exec from Python:
python3 -- -c 'import subprocess; subprocess.run("/nix/store/zbra25fvpbkakzp28p0adpgc9952hj78-nixos-rebuild/bin/nixos-rebuild")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/nix/store/7ahkv87jj59z90yal5dcrgagz58cqmz6-python3-3.11.6/lib/python3.11/subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/7ahkv87jj59z90yal5dcrgagz58cqmz6-python3-3.11.6/lib/python3.11/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/nix/store/7ahkv87jj59z90yal5dcrgagz58cqmz6-python3-3.11.6/lib/python3.11/subprocess.py", line 1950, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 8] Exec format error: '/nix/store/zbra25fvpbkakzp28p0adpgc9952hj78-nixos-rebuild/bin/nixos-rebuild'

I understand that technically this is a bug in nixos-rebuild, but it wouldn’t have occurred if .#machine resolved to .#.configurations.aarch64-linux.machine and failed because the configuration does not exist (because it wasn’t tested), and I’d have to explicitly specify absolute attribute path if I wanted to use such configuration.

one day you might deploy from your x86_64-linux and the next, it might be your colleague on their aarch64-darwin.

So, if the configuration isn’t defined to be built from a given system, they’d get an error that it does not exist.

<system> becomes ambiguous

I don’t think it does. This is simply a string that is passed to the builtins.derivation system argument, unchanged. Flake evaluation is pure, hence it’s passed via attribute path instead of builtins.currentSystem. This is the case for packages, apps, checks, formatter, devShells, etc.
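To illustrate the point, here is a minimal sketch without Nixpkgs in which the <system> level of the attribute path is the only source of the derivation's system argument. The mkExample helper and the /bin/sh builder are assumptions for illustration; whether /bin/sh is available to the build depends on the platform and sandbox settings.

```nix
let
  # The string from the attribute path is forwarded verbatim to
  # builtins.derivation; nothing here reads builtins.currentSystem.
  mkExample = system: builtins.derivation {
    name = "example";
    inherit system;
    builder = "/bin/sh";
    args = [ "-c" "echo hello > $out" ];
  };

  systems = [ "x86_64-linux" "aarch64-linux" ];
in
{
  # Yields packages.<system>.example for every listed system.
  packages = builtins.listToAttrs (map
    (system: { name = system; value.example = mkExample system; })
    systems);
}
```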

Actual use w.r.t. derivation output depends on the top-level attribute, e.g. for apps and devShells it also dictates the host platform, while for packages it depends on the type of package being built, for example, a flake may expose hello (where system is a host platform), but also hello-macos-universal where there is no clear system value equivalent but the host platform is still well-defined.

Far fewer tests will run

NixOS/Nixpkgs has infrastructure for running integration tests under VM and running commands with an emulator. For example, for Go packages:

# Set GOOS/GOARCH/etc to host platform equivalent before running.
go test -exec ${lib.escapeShellArg (stdenv.hostPlatform.emulator buildPackages)} -- "${goPackagesForTest[@]}"

Currently cross-compilation is closer to a second-class citizen in Nixpkgs, but that shouldn’t be the status quo. Sure, not all packages can be updated to run tests when cross-compiling, but I’d like to emphasize that a lot actually can.

@tie
Member Author

tie commented Mar 24, 2024

I understand that technically this is a bug in nixos-rebuild

Not related to this discussion, but this actually seems to be a bug in Bash. Kernel returns ENOEXEC in this case and bash happily proceeds with interpreting executable as a script. It should be returning an error if HAVE_HASH_BANG_EXEC is defined and a script begins with a shebang line that refers to an executable file.
See https://git.savannah.gnu.org/cgit/bash.git/tree/execute_cmd.c?id=f3b6bd19457e260b65d11f2712ec3da56cef463f#n6046
(NB ENOEXEC, errno 8, “Exec format error”)

Edit: this behavior is mandated by the standard 🤷
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01_01

If the execl() function fails due to an error equivalent to the [ENOEXEC] error defined in the System Interfaces volume of POSIX.1-2017, the shell shall execute a command equivalent to having a shell invoked with the pathname resulting from the search as its first operand, with any remaining arguments passed to the new shell, except that the value of "$0" in the new shell may be set to the command name. […]

Labels
documentation, flakes, new-cli, with-tests

4 participants