Automatically include mounted btrfs subvolumes in NETFS backups #3175
@lzaoral I will test how it behaves on SLES systems Offhandedly I think the main problem is I.e. when there are mounted btrfs "thingies" For example a disklayout.conf on SLES15-SP5 (excerpts)
For example assume in addition to @/.snapshots/1/snapshot Then a 'tar' backup would contain the system files
So the backup would be basically about three times But during 'tar' restore there is no deduplication |
I am not a btrfs expert at all, but is it possible to distinguish snapshot subvolumes from "normal" (non-snapshot) subvolumes? Then we could save this subvolume metadata (snapshot yes/no) and do something based on the information when recreating and restoring. First we would probably just skip snapshots, later we could do something more intelligent if possible. |
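One possible way to tell the two kinds apart, as a sketch under assumptions: `btrfs subvolume list -s <path>` restricts the listing to snapshot subvolumes, and each output line ends in `path <subvolume path>`. The helper name `snapshot_paths_from_listing` and the sample listing below are hypothetical, and the exact output format may differ between btrfs-progs versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of ReaR: reads the output of
# 'btrfs subvolume list -s <mountpoint>' on STDIN and prints only the
# subvolume paths, i.e. the text after the last ' path ' keyword.
# (A subvolume path that itself contains ' path ' would be cut short.)
snapshot_paths_from_listing() {
    sed -n 's/^.* path //p'
}

# Canned sample listing in the assumed output format, so that the sketch
# runs without a btrfs filesystem.
sample_listing='ID 267 gen 35 cgen 30 top level 266 otime 2024-03-07 10:15:00 path @/.snapshots/1/snapshot
ID 270 gen 41 cgen 41 top level 266 otime 2024-03-08 09:00:00 path @/.snapshots/2/snapshot'

printf '%s\n' "$sample_listing" | snapshot_paths_from_listing
```

On a real system the same helper could be fed with `btrfs subvolume list -s / | snapshot_paths_from_listing`, and the resulting paths compared against the subvolume path of each mounted subvolume.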
Only an offhanded thought: I fear btrfs normal subvolumes versus btrfs snapshot subvolumes I think it is in general possible that one same When '/here' and '/there' are included in a 'tar' backup Tomorrow I will experiment a bit with that. |
Perhaps we can by default have every mounted "thingy" |
Currently I am exploring how 'tar' behaves in general It seems 'tar' behaves well forgiving in this case:
So perhaps only mounted btrfs snapshot subvolumes I think it is OK if a user mounts the same stuff |
I disagree that it is an example of this problem. Snapshots are not the same thing mounted at different places. They are different things mounted at different places: snapshots exist because their content is (at least in principle) different. One filesystem mounted at several places can occur as well, and it will result in an explosion of backup data, but restoring it twice should then not increase the size of the restored system. It only slows down the restore, because you keep restoring to the same filesystem. I would not try to solve these two problems in the same way (cf. RFC 1925 item 5).
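To make the second case (one and the same thing mounted at several places) concrete, here is a minimal sketch that reports every mount source that appears at more than one mountpoint. It assumes input lines of the form `SOURCE TARGET`, as `findmnt -rn -o SOURCE,TARGET` prints them, where btrfs subvolumes show up as `device[/subvolume]`. The function name and the sample data are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of ReaR: reads "SOURCE TARGET" pairs on STDIN
# and prints each SOURCE that is mounted at more than one TARGET,
# followed by all of its mountpoints.
sources_mounted_more_than_once() {
    awk '
        { count[$1]++; targets[$1] = targets[$1] " " $2 }
        END { for (s in count) if (count[s] > 1) print s ":" targets[s] }
    '
}

# Canned sample in the assumed format. Different subvolumes of the same
# device count as different sources, so only @/home is reported.
sample_mounts='/dev/sda2[/@] /
/dev/sda2[/@/home] /home
/dev/sda2[/@/home] /mnt/home-again
/dev/sda2[/@/.snapshots/1/snapshot] /snapshot1'

printf '%s\n' "$sample_mounts" | sources_mounted_more_than_once
```

Note that the mounted snapshot is not reported here: it is a different source, which matches the distinction drawn above.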
It thinks that the doubled file names are different names for the same files (i.e. hardlinks), which is not entirely correct - not sure if it can have some unwanted consequences or not. |
Yes. The whole point of my experiments with 'tar' here If it is actually only one root problem If it is actually several separated problems then #3175 (comment) From my experiments with 'tar' in the past I know that |
That's an interesting idea. For multiple identical arguments to
Is the same happening with different btrfs snapshots mounted at different mountpoints? I.e. does tar consider files in different snapshots (originally same, but possibly different when they have been modified since the snapshot was taken) as hardlinks to the same file? |
Thank you for the feedback, @jsmeix! I'll amend the code to skip the backup of all mounted btrfs snapshot subvolumes. The duplication of files in the backup when a filesystem or btrfs subvolume is mounted more than once is a different (though related) issue, so I suggest resolving it separately.
In backup/NETFS/default/500_make_backup.sh use without_subsequent_duplicates $TMP_DIR/backup-include.txt to ignore duplicate arguments that tell 'tar' and 'rsync' what should be archived. This avoids that 'tar' and 'rsync' archive the exact same things multiple times, which needlessly increases the backup time and, in the case of 'tar', also the backup archive size, the storage space, and the backup restore time, cf. #3175 (comment)
I tested it with SLES15-SP5
I was in particular interested how things behave
manual setting in etc/rear/local.conf
With that I got duplicated things in the backup.tar.gz To make ReaR behave backward compatible for SLES users With these additional changes I no longer get @lzaoral @pcahyna @rear/contributors
@jsmeix if you have snapshots, can you please test #3175 (comment) : "does tar consider files in different snapshots (originally same, but possibly different when they have been modified since the snapshot was taken) as hardlinks to the same file?" ? |
Is it a regression with this PR, or did you get duplicated entries in backup.tar.gz even before? What are the duplicated entries? Aren't you missing |
I tested ZFS and the same files in a snapshot and in the original filesystem do not show up as hardlinks to the same file in the |
Hi all, reviewing what needs to be done there.
@pcahyna First I would like to implement |
In lib/global-functions.sh added a new unique_unsorted() function that outputs the lines of a file or from STDIN without subsequent duplicate lines while keeping the ordering of the lines, see #3177 In backup/NETFS/default/500_make_backup.sh use unique_unsorted $TMP_DIR/backup-include.txt to ignore duplicate arguments that tell 'tar' and 'rsync' what should be archived. This avoids that 'tar' and 'rsync' archive the exact same things multiple times, which needlessly increases the backup time and, in the case of 'tar', also the backup archive size, the storage space, and the backup restore time, cf. #3175 (comment)
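The behaviour described for unique_unsorted (drop later duplicates, keep the original order) can be sketched with a common awk idiom. This is only an illustration under a hypothetical name; the real function in lib/global-functions.sh may be implemented differently.

```shell
#!/usr/bin/env bash
# Sketch only, under a hypothetical name: the real unique_unsorted in
# lib/global-functions.sh may differ.
unique_unsorted_sketch() {
    # With a file argument awk reads that file, without arguments it reads STDIN.
    # A line is printed only the first time it is seen, so the order is kept.
    awk '!seen[$0]++' "$@"
}

# Duplicate backup include arguments collapse to their first occurrence.
printf '%s\n' / /home / /var /home | unique_unsorted_sketch
```

Unlike `sort -u`, this keeps '/' at the front of the list, which matters when the order of the include arguments is meaningful.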
@jsmeix Thank you for implementing |
fbfc80b to 3690ead
3690ead to e693831
@lzaoral
Copy of my
Here you must use
because here STDOUT gets written into disklayout.conf
Yes - I know - that is horrible coding style |
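The general pitfall behind this remark can be shown in a few lines: when STDOUT of a whole function or block is redirected into a file, any user message printed to STDOUT silently becomes part of that file. The sketch below uses hypothetical names and a made-up data line, and it does not reproduce the exact call that was recommended in the review.

```shell
#!/usr/bin/env bash
# Hypothetical names throughout, this only illustrates the pitfall.
layout_file="$(mktemp)"

save_layout_sketch() {
    # The caller redirects STDOUT of this function into the layout file,
    # so user messages must go to STDERR to keep them out of that file.
    echo "Saving layout (message for the user)" 1>&2
    # Data line that is meant to end up in the file.
    echo "fs /dev/sda2 / btrfs"
}

save_layout_sketch > "$layout_file"

# Only the data line arrived in the file.
layout_content="$(cat "$layout_file")"
rm -f "$layout_file"
printf '%s\n' "$layout_content"
```

Redirecting at the single command that produces the data, instead of around a whole block, avoids the problem altogether.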
... unless they are explicitly excluded. Resolves: rear#2928
Otherwise, the component itself would not be included if it had any child components of the `fs` type.
e693831 to 3fe88cc
The automatic exclusion of snapshots from backups |
From plain looking at the code I think it is OK
so I approve it "bona fide".
I will test it later on SLES as time permits
(i.e. that could happen after it was merged).
Overhauled 400_create_include_exclude_files.sh: First back up the mounted filesystems so that '/' is backed up first and the basic system files get stored first in the backup, then back up what is specified in BACKUP_PROG_INCLUDE, see #3177 (comment) and #3217 (comment). Report suspicious cases via LogPrintError so that the user is at least informed. Remove duplicates but keep the ordering to avoid possibly unwanted and unexpected subtle consequences, see #3175 (comment). Verify that at least '/' is in backup-include.txt, see #3217. Redirect stdout into files exactly at the command where needed instead of using more global redirections, cf. "horrible coding style" in #3175 (comment)
I tested "rear -D mkbackup" Details: Disk layout:
SUSE default btrfs structure:
etc/rear/local.conf
The duplicates in BACKUP_PROG_INCLUDE and BACKUP_PROG_EXCLUDE disklayout.conf
What I get in the backup included and excluded
The only thing that worried me is /.snapshots
looks scary - more than 300 thousand files
so what actually matters is
which looks much better
so when /.snapshots gets included in the backup BUT
got recreated during disk layout recreation
I will test what happens when I do "rear recover" |
Thank you for the feedback, @jsmeix! In that case, it might be a good idea to autoexclude edit: The snapper btrfs subvolume is handled here:
Test what happens when I do "rear recover"
So /.snapshots must be excluded from the backup restore @lzaoral |
Mainly for my own information here Excerpts from /var/log/rear/rear-localhost.log
Indeed there is no var/tmp in the recreated system
My current guess is that there is no var/tmp
but what is a real bug is that
because during "rear recover" TMPDIR should not be set at all
I think I found the root cause why there is no var/tmp
i.e. restore/default/900_create_missing_directories.sh
which is true at that point in time |
I re-did "rear -D mkbackup"
I also did
to verify that the change in After "rear -D mkbackup" I got in particular
note therein /.snapshots (because it is a mountpoint) With that I re-did "rear -D recover"
In the recreated system I got in particular
which looks perfectly right now. I rebooted the recreated system
(15:05:03 UTC equals 05:05:03 PM CEST) |
@lzaoral Therefore I also assigned this pull request to him I could also merge it but I already reviewed it and |
Overhauled backup/NETFS/default/400_create_include_exclude_files.sh
* Now do first backup the mounted filesystems to backup '/' first so the basic system files get stored first in the backup and then backup what is specified in BACKUP_PROG_INCLUDE see #3177 (comment) and #3217 (comment)
* Report suspicious cases as LogPrintError to have the user at least informed.
* Remove duplicates in backup-[in/ex]clude.txt but keep the ordering to avoid possibly unwanted and unexpected subtle consequences see #3175 (comment)
* Verify that at least '/' is in backup-include.txt see #3217
* Redirect stdout into files exactly at the command where needed instead of more global redirections, cf. "horrible coding style" in #3175 (comment)
Update backup/NETFS/default/500_make_backup.sh
* In backup/NETFS/default/500_make_backup.sh unique_unsorted is no longer needed because backup-include.txt is already without duplicates because unique_unsorted is now called in backup/NETFS/default/400_create_include_exclude_files.sh
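The steps listed in this commit message can be sketched roughly as follows: mountpoints first, then the additional includes, duplicates removed in order, and a final check that '/' is present. Function name, variable names, and sample values are hypothetical, and the sketch prints to the terminal instead of writing backup-include.txt; it is not the code of 400_create_include_exclude_files.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the real 400_create_include_exclude_files.sh.
# $1: newline-separated mountpoints, $2: newline-separated additional includes.
# Prints the mountpoints first, then the additional includes, skipping empty
# lines and every later duplicate so that the original order is kept.
build_backup_include_sketch() {
    printf '%s\n' "$1" "$2" | awk 'NF && !seen[$0]++'
}

# Sample values (assumptions): '/home' is both a mountpoint and an include.
mountpoints='/
/home
/var'
additional_includes='/home
/srv/data'

include_list="$(build_backup_include_sketch "$mountpoints" "$additional_includes")"

# Verify that at least '/' is in the list, otherwise inform the user.
if ! printf '%s\n' "$include_list" | grep -qx '/'; then
    echo "Error: '/' is missing from the backup include list" 1>&2
fi

printf '%s\n' "$include_list"
```

Because the duplicates are already removed at this stage, a later step that consumes the list does not need to deduplicate it again.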
Oops! |
Test when one has other snapshot subvolumes mounted I mount btrfs snapshot 2 at /snapshot2 and
Did again "rear -D mkbackup" as before in
i.e. without explicitly excluding the mounted btrfs snapshots. backup.tar.gz size from before in disklayout.conf is now
in particular the disabled btrfs entries
The only things that backup.tar.gz contains
After "rear -D recover"
Also 'df -h' looks well
for comparison on the original system
Mainly for my own information here In the recreated system '/tmp/' has wrong permissions
for comparison on the original system
Excerpts from the "rear -D recover" log file
In backup.tar.gz there is neither 'tmp' nor 'var/tmp' I think I found the root cause why /mnt/local/tmp
so in .../var/lib/rear/layout/diskrestore.sh there is
This happens because on the original system there is
The cause is my
and the root cause is my ignorance
so with
all works reasonably well
because now in my backup.tar.gz there is
so /tmp/ and /var/tmp/ get restored But because of this I found out that Via |
In default.conf add '/var/tmp/rear.*' to BACKUP_PROG_EXCLUDE: since ReaR uses /var/tmp/rear.* as BUILD_DIR, one would otherwise get at least the whole BUILD_DIR of the current "rear mkbackup" run in the backup by default, see at the end of #3175 (comment). Additionally describe why ReaR's VAR_DIR/output is excluded. Also describe why the '/directory/*' form is used.
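For users who want to exclude further directories on top of this default, a configuration fragment in etc/rear/local.conf could look like the following sketch. The path '/srv/scratch' is a made-up example. My understanding, stated as an assumption, is that the '/directory/*' form excludes only the contents, so that the directory itself stays in the backup and gets restored as an empty directory.

```shell
# etc/rear/local.conf (sketch with a hypothetical path):
# Append to the array with '+=' so that the defaults from default.conf,
# including '/var/tmp/rear.*', are kept instead of being overwritten.
BACKUP_PROG_EXCLUDE+=( '/srv/scratch/*' )
```

Assigning with `BACKUP_PROG_EXCLUDE=( ... )` instead would drop the default entries, so the append form is usually the safer choice.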
@jsmeix No worries, the exclusion of the snapper base subvolume is quite simple. Could you please test the following patch on SLES? Thank you!
diff --git a/usr/share/rear/layout/save/GNU/Linux/230_filesystem_layout.sh b/usr/share/rear/layout/save/GNU/Linux/230_filesystem_layout.sh
index cdeca6de..d34a4881 100644
--- a/usr/share/rear/layout/save/GNU/Linux/230_filesystem_layout.sh
+++ b/usr/share/rear/layout/save/GNU/Linux/230_filesystem_layout.sh
@@ -467,12 +467,14 @@ fi
# see https://btrfs.wiki.kernel.org/index.php/Mount_options
test "/" != "$btrfs_subvolume_path" && btrfs_subvolume_path=${btrfs_subvolume_path#/}
+ # Automatically exclude all mounted snapper and snapshot subvolumes from the backup.
+ # See https://github.com/rear/rear/pull/3175#issuecomment-1983498175 and
+ # https://github.com/rear/rear/pull/3175#issuecomment-2111776529
+ if test "$snapper_base_subvolume" = "$btrfs_subvolume_path" || btrfs_snapshot_subvolume_exists "$subvolume_mountpoint" "$btrfs_subvolume_path"; then
+ echo "#btrfsmountedsubvol $device $subvolume_mountpoint $mount_options $btrfs_subvolume_path"
# Finally, test whether the btrfs subvolume listed as mounted actually exists. A running docker
# daemon apparently can convince the system to list a non-existing btrfs volume as mounted.
# See https://github.com/rear/rear/issues/1496
- if btrfs_snapshot_subvolume_exists "$subvolume_mountpoint" "$btrfs_subvolume_path"; then
- # Exclude mounted snapshot subvolumes
- echo "#btrfsmountedsubvol $device $subvolume_mountpoint $mount_options $btrfs_subvolume_path"
elif btrfs_subvolume_exists "$subvolume_mountpoint" "$btrfs_subvolume_path"; then
echo "btrfsmountedsubvol $device $subvolume_mountpoint $mount_options $btrfs_subvolume_path"
else
|
Pull Request Details:
Type: Enhancement
Impact: High
Reference to related issue (URL): NETFS tar backup no btrfs subvolumes by default #2928
How was this pull request tested?
rear savelayout
and manual inspection of generated files and backup/restore of a Fedora Rawhide machine
Description of the changes in this pull request:
$RESTORE_EXCLUDE_FILE