New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce new disk partitioning scheme #478
base: main
Are you sure you want to change the base?
Conversation
# SPDX-License-Identifier: Apache-2.0 | ||
# Example to create a bios compatible gpt partition | ||
# | ||
# The partitioning scheme is hardcoded for 500G disk. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this comment we could actually make this specific to Lenovo X1 Carbon Gen11 SKU (stock keeping unit) with 500G of internal NVMe. I'm afraid we must build target specific installers as we have not yet described disk space extension in case the internal NVMe is larger. Optionally, the installer could resolve the size of the disk with lsblk
and parametrize the size to generate disk config for generic
use. I think all the other fixed sizes could remain the same and only the remaining space here https://github.com/tiiuae/ghaf/pull/478/files#diff-8a1fbff8658406e95003d28116ba21ced73c81b4fb8d1a5edad5fe5aeef1c427R154 would have to be calculated. Please check with @remimimimimi if this is feasible. The next iterations of installer must support other installer parameters - like user specific secrets - as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optionally, the installer could resolve the size of the disk with lsblk and parametrize the size to generate disk config for
generic
use.
This exact approach assumes that the installer program changes disk-config.nix
file before passing it into disko
utility, which I am pretty sure we want to avoid.
I think all the other fixed sizes could remain the same and only the remaining space would have to be calculated.
Unfortunately, disko
utility for some reason does not support size = "100%"
parameter in that exact place. It fails with error while partitioning the disk. Probably I should have mentioned this in the comment inside the file itself.
The next iterations of installer must support other installer parameters - like user specific secrets - as well.
This is out of scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optionally, the installer could resolve the size of the disk with lsblk and parametrize the size to generate disk config for
generic
use.This exact approach assumes that the installer program changes
disk-config.nix
file before passing it intodisko
utility, which I am pretty sure we want to avoid.
Ok. I'm fine with these assumptions for this iteration so that we can start development of the other areas this enables but I think we've identified a few issues that affect the installer and target device support design moving forward. Given the constraints and your scoping, I propose changing the file location and naming because it's not template
and generic
anymore. It wasn't even in #340 but we kind of agreed to iterate this to proper for this device in the next iteration which is this.
My proposal for placement is <ghaf_repo_root>/targets/lenovo-x1-carbon-disk-config.nix
because this is very specific to that device. I don't think that is a problem because the current iteration of installer only supports X1. If we want template (or rather disko-config.nix
nix module which accepts arguments from installer) and generic x86_64 disko-config, we must change also the dependency in the installer. Feel free to disagree but imo the filename remains confusing if we don't change it.
I think all the other fixed sizes could remain the same and only the remaining space would have to be calculated.
Unfortunately,
disko
utility for some reason does not supportsize = "100%"
parameter in that exact place. It fails with error while partitioning the disk. Probably I should have mentioned this in the comment inside the file itself.
Sure, never too late to add insight in comments.
The next iterations of installer must support other installer parameters - like user specific secrets - as well.
This is out of scope of this PR.
Please see my previous comment. Scoping and addressing dependencies (even if next iterations of development) are both important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, disko utility for some reason does not support size = "100%" parameter in that exact place. It fails with error while partitioning the disk. Probably I should have mentioned this in the comment inside the file itself.
It seems that this should work @unbel13ver .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Co-Disko maintainer here. Regarding secrets one can use the diskoImagesScript
to generate full systems that also contain secrets. The script is run outside of the NixOS sandbox. Documentation is here: https://github.com/nix-community/disko/blob/master/docs/reference.md#generating-disk-images-with-secrets-included-using-disko
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My proposal for placement is <ghaf_repo_root>/targets/lenovo-x1-carbon-disk-config.nix because this is very specific to that device.
That is reasonable, implemented.
Also I managed to take the rest of the disk in use for StorageVM (should have added "100%FREE" instead of "100%").
@leivos-unikie could you retest this please? I've updated the expected partitioning in the header of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for testing! Those are not related to the matter of this PR (those are not something worth to worry about, anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the detailed comments.
@unbel13ver can you provide test instructions for this PR? What is the expected outcome of new partitioning scheme? |
68c2e84
to
00b7412
Compare
Testing instructions added. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- lsblk gives expected output
- ci-test-automation passed
- performance ok
- apps launch
00b7412
to
af1d8c7
Compare
af1d8c7
to
1794271
Compare
]; | ||
}; | ||
}; | ||
vm_storage_a = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the vm_storage_a and vm_storage_b used? For system updates it's clear to me that one would apply changes to alternating partitions, but how does this work for vm state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were some ideas that some VMs can be supplied as pre-built images or so, and the end user will not be able to change the contents of that images. Those VMs will be updated together with the main system and must correspond the specific version of the base system.
name = "boot"; | ||
size = "1M"; | ||
type = "EF02"; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The EFI standard allows only one boot loader to be registered, how would the second partition used, in case the other "fails"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently there is only one bootloader partition. The duplication in this case require some external tooling to switch between A and B, which we decided to avoid.
mountOptions = ["umask=0077"]; | ||
}; | ||
}; | ||
# LVM pool that is going to be encrypted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the encryption done here? Would it be not cryptsetup -> lvm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some discussion going about that in the mail thread. I do not know at this moment how it is going to be implemented, and I decided to remove this comment to avoid the confusion.
}; | ||
}; | ||
# LVM pool that is going to be passed to the Storage VM | ||
other_2 = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are VMs not encrypted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StorageVM partition is not going to be encrypted, since it is supposed that it stores encrypted data. In this case it allows us to backup and recover the stored data safely if needed.
]; | ||
}; | ||
}; | ||
recovery = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the recover also some NixOS? How is it booted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a placeholder currently, but yes, it is going to contain some factory recovery image.
Brian asked me to compare using LVM vs ZFS for this setup. Here a comparison:
I currently do not have the full picture of all how this a/b partitioning will be implemented in the end. Especially around VM storage. |
format = "ext4"; | ||
mountpoint = "/"; | ||
mountOptions = [ | ||
"defaults" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add noatime
for all of the ext4 mountpoints. It's not useful for most applications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hint, done
type = "filesystem"; | ||
format = "vfat"; | ||
mountpoint = "/boot"; | ||
mountOptions = ["umask=0077"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like adding "nofail"
for non-root mounts. It makes recovering easier in case there is something wrong with mountpoints. Otherwise systemd will go into "emergency" mode. This can even happen on nixos-rebuild.
mountOptions = ["umask=0077"]; | |
mountOptions = ["umask=0077" "nofail"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hint, done
]; | ||
}; | ||
}; | ||
reserved_a = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe some comment what this is for would be helpful.
If this is just a spaceholder do we need a filesystem on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the filesystem from the reserved partitions and the recovery partition. I have also added a comment section in the beginning of the file to explain some of its parts.
1794271
to
eb54ea7
Compare
ghaf hardened requires a ZFS kernel config fragment. It's not an issue but it's a dependency to keep in mind.
Yeah, the A/B partitioning was specified and documented before the latest arch documentation process change. It would probably help to collect the prev documented designs and LVM/ZFS comparisons together according to the new process. With such many contributors I'd like to call it a collaborative documentation process to ensure a better full picture for contributors. |
Is it possible to use smaller partitioning size and resizing mechanism later with current implementation? As final image generated will be 260GB which will take more time to generate and flash. If not whether following postboot commands will be valid to execute still? |
New disk configuration provides grounds for upcoming features, such as AB software updates and Storage VM and many more. Signed-off-by: Ivan Nikolaenko <ivan.nikolaenko@unikie.com>
Signed-off-by: Ivan Nikolaenko <ivan.nikolaenko@unikie.com>
I have added a second commit with ZFS configuration so it is easier to compare what has changed. Known issues: For some reason, |
that should be fixed if you rebase against main. that was fixed in disko and the main has been bumped to use the newer version. |
This shouldn't be the case since as Mic92 mentioned NixOS uses an LTS kernel by default, which should generally be supported by ZFS. But I don't see the line in the diff anyway, so it's fine I guess :) |
# Copyright 2022-2024 TII (SSRC) and the Ghaf contributors | ||
# SPDX-License-Identifier: Apache-2.0 | ||
# | ||
# !!! To utilize this partition scheme, the disk size must be >252G !!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? I suspect the first pool can be a fair bit smaller.
The quota
on a zfs filesystem limits its maximum size but doesn't guarantee availability of the given amount of space. If you really want to ensure the availability, you should set a reservation
or refreservation
on the filesystems.
# - ESP-B : (500M) | ||
# | ||
# First ZFS pool contains next partitions: | ||
# - root-A : (50G) Root FS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we're using ZFS it doesn't make that much sense to have exactly two root filesystems. It would make more sense to:
- Allow more than two, since zfs doesn't limit us to allocating the space ahead of time
- Not set a quota on them, since they should (AFAIU) be immutable and therefore fixed-size anyway
Depending on our threat model, it may also be undesirable to use zfs (for the root filesystem(s)) at all, since this mixes them up with the mutable pieces of the filesystem and doesn't allow us to use something like dm-verity to authenticate the whole partition (guarding against malicious filesystem images, for instance).
I've also personally experienced some unreliability from zfs, where the zfs kernel driver would crash and any further access to files on zfs filesystems would hang until reboot. Using ext4 (with dm-verity to counteract both malicious modification and bitrot) would probably be more reliable.
# - root-B : (50G) | ||
# - vm-storage-A : (30G) Possible standalone pre-built VM images are stored here | ||
# - vm-storage-B : (30G) | ||
# - reserved-A : (10G) Reserved partition, no use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be necessary. ZFS allows us the flexibility to add or remove filesystems from a pool after the fact if we end up needing them.
}; | ||
}; | ||
}; | ||
zroot_2 = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably avoid importing this pool in the host at all, but I'm not sure that's currently feasible with disko. @Mic92?
# - root-B : (50G) | ||
# - vm-storage-A : (30G) Possible standalone pre-built VM images are stored here | ||
# - vm-storage-B : (30G) | ||
# - reserved-A : (10G) Reserved partition, no use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like with zfs, LVM affords us some flexibility here (though indeed not as much). Why were you thinking to allocate this? I could see it being more necessary with LVM than with zfs, but I'm still skeptical about its usefulness.
# - reserved-A : (10G) Reserved partition, no use | ||
# - reserved-B : (10G) | ||
# - gp-storage : (50G) General purpose storage for some common insecure cases | ||
# - recovery : (rest of the LVM pool) Recovery factory image is stored here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# - recovery : (rest of the LVM pool) Recovery factory image is stored here | |
# - recovery : (rest of the ZFS pool) Recovery factory image is stored here |
@@ -0,0 +1,181 @@ | |||
# Copyright 2022-2024 TII (SSRC) and the Ghaf contributors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we actually have this file if we're not referencing it anywhere?
New disk configuration provides grounds for upcoming features, such as AB software updates and Storage VM and many more.
The Lenovo X1 config has two LVM pools, first one is fixed-size 250G "system" partition (which is going to be encrypted) and the rest of the disk is dedicated to the StorageVM.
Description of changes
Checklist for things done
x86_64
aarch64
riscv64
nix flake check --accept-flake-config
and it passesTesting
Current upstream partition scheme is as follows:
To test new partitioning scheme, run
lenovo-x1-carbon-gen11-debug-installer
image and install Ghaf into the internal SSD storage of Lenovo-X1.After insallation is completed and the laptop is booted into the Ghaf system, check partitions with
lsblk
command.It should be as follows: