New permissions handling in 1.2.0 causes re-backup of files that haven't changed #988

Open
kosal75 opened this issue Apr 28, 2019 · 90 comments

Comments

@kosal75

kosal75 commented Apr 28, 2019

I have just updated from 1.1.24 to 1.2.0 (common, gnome, qt4), and making a snapshot takes hours. When downgrading to 1.1.24, all is normal (5-6 minutes, as always).

Ubuntu 18.10 with kernel 4.18.0-18.

@ghost

ghost commented Apr 28, 2019

Hi Germar, hope all is good! 😀
Like @kosal75 I too got updated today, and ran into many issues. These were:

  • My previous snapshots (over 2 years worth) were no longer "considered". Although they were visible in the BiT-GUI, a new update triggered a completely new "virgin" rsync of all files on all disks defined previously (see below), whether changed, new or not.
  • Having said that, $HOME/.config/backintime/config was still found, recognised, and executed without errors, i.e. only the files/dirs previously defined in $HOME/.config/backintime/config were backed up.
  • As @kosal75 writes, the rsync is extremely slow. With my backup disk on a type C connection, write speeds are below 1 MB/s. Normally (and after downgrade) I get > 200 MB/s.
  • After the BiT/ rsync has been running for a while, my backup disk is unmounted autonomously. As a result BiT, which still runs, throws endless error notifications.

I too downgraded to one version lower available on my ppa:
sudo apt install backintime-qt4=1.1.12-2 backintime-common=1.1.12-2 backintime-gnome=1.1.12-2
sudo apt-mark hold backintime-common backintime-qt4 backintime-gnome

After this, peace and quiet has returned, all is good, and BiT performs with its previous accuracy, speed, and without flaws.

FYI: currently I'm on Xubuntu 18.04 LTS HWE (kernel 4.18.0-18).

Hope this helps...

@kosal75
Author

kosal75 commented Apr 28, 2019 via email

@pehlm

pehlm commented Apr 29, 2019

Thanks @BelladonnavGFH for your tip! I haven't tried to do a backup with 1.2.0 and cannot say anything about the slowness, but the old backups weren't, as you said, recognized any longer and I panicked. I'm also on Xubuntu 18.04 and it wasn't even possible to resize the window, it was super big! Not possible to get it smaller.
Thanks to you I have it working again. Even if I was forced to get an older version 1.1.12, the version before 1.2.0 was 1.1.24.
To the developers: I'm hoping you can get the new version to work, Back In Time is an extremely good backup-program!

@ghost

ghost commented Apr 29, 2019

No thanks @pehlm. If it now works for you, perfect!
As a clarification: if 1.1.24 works for you, don't let me stop you using that one. The only reason I went to 1.1.12 is because that is the only version available on the ppa(s) I use(next to 1.2.0 that is of course). And as you mention, I assume this is only a temp thing, hoping to upgrade when the issues are resolved. Because, yes, I agree, BiT is very good...

@hannes101

I can confirm the slow backup, although all my backups were found and no full backup was done. Looking at the logs, though, I could see that apparently many files were backed up that hadn't actually changed.

@catmatist

In the Ubuntu bionic ppa for bit "stable", when bit 1.2 was added, 1.1.24 was removed. Can it be put back, please? There were many bugs fixed between the version in the official Ubuntu repositories (1.1.12?) and 1.1.24, which is why I was using the ppa. I was fortunate enough to see the change coming before it was applied, and disabled the ppa in order to keep the version I've been using until I have time to figure out if 1.2 will work for me. Which will take a lot longer if I have no easy way to get back to 1.1.24 on a system that I try 1.2 on. Thank you for working on this very useful software.

@jns-

jns- commented May 1, 2019

After updating to 1.2.0 BiT does a (nearly) full backup because file permissions are handled differently. Before 1.2.0 all destination file permissions were set to -rw-r--r--. In 1.2.0 rsync is executed with --perms option which tells rsync to preserve the source file permission. That's why so many files seem to be changed.

@colinl

colinl commented May 1, 2019

@jns is it actually backing them up again (and consuming extra disc space) or is it just changing the file permissions, which takes time but does not use any extra disc space?

@jns-

jns- commented May 1, 2019

I think it's backing up the files again. My last (and first 1.2.0) backup took roughly 1 hour and created a snapshot [WITH ERRORS]. However, no error messages were written to the log file. After deleting that snapshot disk usage dropped from 35% to 19% (~300G). Unfortunately I did not check file inodes before deleting that snapshot. Considering the amount of freed disk space, I assume BiT made a complete backup, except files where permissions matched -rw-r--r--.

@catmatist

In 1.1.24 (and I don't know how many earlier versions), there is a per-profile setting for "Full rsync mode". If you turned it on, it copied permissions to the backup directory with rsync instead of saving them in a file. I think I saw a change log comment somewhere that said 1.2 dropped support for the previous default mode and only supports "Full rsync mode". I wonder if the transition is smoother for backups that were already using "Full rsync mode"? Is there any documentation anywhere on how to make the switch to the new version without tears?

@Germar
Member

Germar commented May 2, 2019

Hey all,
sorry for the inconvenience with version 1.2

As @jns- already pointed out, BiT will now let rsync change the permissions directly in the snapshot. At the first snapshot with the new version this will make rsync calculate checksums for all source and all snapshot files to make sure they are definitely the same, which of course takes a long time. Normally rsync only compares modification time, size and permission to tell if a file has changed. With the next snapshot rsync should be as fast as before or even faster because of the optimized code.

To confirm you could compare the inodes of files that didn't change. Take a look at this FAQ: How can I check if my snapshots are incremental (using hard-links)?
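The inode comparison suggested above can be sketched as a small POSIX shell helper. This is only an illustration: the function name same_inode is made up here, and the paths you pass in are whatever two copies of an unchanged file you want to compare (for example the same file in two different snapshot folders).

```shell
#!/bin/sh
# Hypothetical helper: report whether two paths are hard links to the same
# inode (incremental snapshot) or independent copies (file was re-backed up).
# Usage: same_inode FILE_IN_OLD_SNAPSHOT FILE_IN_NEW_SNAPSHOT
same_inode() {
    # "-ef" is true when both paths refer to the same device and inode
    if [ "$1" -ef "$2" ]; then
        echo "hard-linked"
    else
        echo "separate copies"
    fi
}
```

If an unchanged file reports "separate copies" between two consecutive snapshots, that file was stored again and consumes extra disk space.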

You can also check the information rsync provides for each file it processed. From man rsync:

The attribute that is associated with each letter is as follows:

  • A c means either that a regular file has a different checksum (requires --checksum) or that a symlink, device, or special file has a changed value. Note that if you are sending files to an rsync prior to 3.0.1, this change flag will be present only for checksum-differing regular files.
  • A s means the size of a regular file is different and will be updated by the file transfer.
  • A t means the modification time is different and is being updated to the sender’s value (requires --times). An alternate value of T means that the modification time will be set to the transfer time, which happens when a file/symlink/device is updated without --times and when a symlink is changed and the receiver can’t set its time. (Note: when using an rsync 3.0.0 client, you might see the s flag combined with t instead of the proper T flag for this time-setting failure.)
  • A p means the permissions are different and are being updated to the sender’s value (requires --perms).
  • An o means the owner is different and is being updated to the sender’s value (requires --owner and super-user privileges).
  • A g means the group is different and is being updated to the sender’s value (requires --group and the authority to set the group).
  • The u slot is reserved for future use.
  • The a means that the ACL information changed.
  • The x means that the extended attribute information changed.

If you have lots of files that didn't change but are marked with c, this could indicate a failing hard drive. You should run fsck and consider replacing the drive.
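To read these flags in a BiT log line, it may help to decode the itemized string mechanically. The sketch below assumes the 11-character YXcstpoguax layout described in man rsync: the first character is the update type and the second the file type, so the attribute letters quoted above start at the third position. Note that a c in the very first position is the update type (man rsync describes it as a local change or creation), not the checksum attribute. The function name explain_itemize is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: list which attributes rsync reports as changed in an
# itemized string such as "cf...p.....".
explain_itemize() {
    flags=$1
    rest=${flags#??}     # drop update type (Y) and file type (X)
    if [ "$rest" = "+++++++++" ]; then
        echo "new item"  # all "+" means the item is newly created
        return
    fi
    out=""
    # attribute slots in the order rsync prints them: c s t p o g u a x
    for name in checksum size mtime perms owner group reserved acl xattr; do
        ch=$(printf '%s' "$rest" | cut -c1)
        rest=${rest#?}
        case $ch in
            .|" "|"") ;;             # slot unchanged
            *) out="$out $name" ;;   # any other character marks a change
        esac
    done
    echo "${out# }"
}

explain_itemize "cf...p....."
```

For the string that shows up throughout this thread, only the permission slot is set, which matches the diagnosis that the files are copied again because of permissions rather than content.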

@kosal75
Author

kosal75 commented May 2, 2019 via email

@colinl

colinl commented May 2, 2019

@Germar I just ran a test on Ubuntu 18.04 backing up to local drive. I created a new backup with version 1.1.24 on default settings then upgraded (via the bit team stable ppa) to 1.2.0 and ran another backup. In the log this showed every file with [C] cf...p..... and the size of the backup doubled (measured with ncdu and inode values checked just to be sure). I am not sure from your previous post whether that is expected or not.

@Germar
Member

Germar commented May 2, 2019

And what happens on a drive not supporting Linux file permissions?

BiT still stores permissions inside the fileinfo.bz2 like in previous versions

@Germar
Member

Germar commented May 2, 2019

In the log this showed every file with [C] cf...p..... and the size of the backup doubled (measured with ncdu and inode values checked just to be sure). I am not sure from your previous post whether that is expected or not.

No, that's not expected. Please hang on, I need to test this. It will take a bit as I first need to prepare a test VM running Ubuntu 16.04 updated to 18.04. Otherwise I'm not able to install 1.1.24 in 18.04.

@Germar
Member

Germar commented May 2, 2019

Hmmm

$ mkdir src dst1 dst2 dst3
$ cp backintime-1.1.24/CHANGES src/
$ rsync --no-p -t -r -v -i src/ dst1/
sending incremental file list
.d..t...... ./
>f+++++++++ CHANGES

sent 39,299 bytes  received 38 bytes  78,674.00 bytes/sec
total size is 39,185  speedup is 1.00
$ rsync --no-p -t -r -v -i src/ dst2/ --link-dest=../dst1/
sending incremental file list
.d..t...... ./

sent 63 bytes  received 19 bytes  164.00 bytes/sec
total size is 39,185  speedup is 477.87
$ rsync -p -t -r -v -i src/ dst3/ --link-dest=../dst2/
sending incremental file list
.d..t...... ./

sent 67 bytes  received 19 bytes  172.00 bytes/sec
total size is 39,185  speedup is 455.64
$ 

All fine so far. But this is strange:

$ mkdir src dst1 dst2 dst3
$ # copy the same file as before into src but this time using nautilus
$ rsync --no-p -t -r -v -i src/ dst1/
sending incremental file list
.d..t...... ./
>f+++++++++ CHANGES

sent 39,301 bytes  received 38 bytes  78,678.00 bytes/sec
total size is 39,185  speedup is 1.00
$ rsync --no-p -t -r -v -i src/ dst2/ --link-dest=../dst1/
sending incremental file list
.d..t...... ./

sent 69 bytes  received 19 bytes  176.00 bytes/sec
total size is 39,185  speedup is 445.28
$ rsync -p -t -r -v -i src/ dst3/ --link-dest=../dst2/
sending incremental file list
.d..t...... ./
cf...p..... CHANGES     <<<----- this is, what actually happens

sent 72 bytes  received 22 bytes  188.00 bytes/sec
total size is 39,185  speedup is 416.86
$ 

EDIT: removed the last part as it was wrong...

@jns-

jns- commented May 3, 2019

No, enabling Preserve ACL makes no difference in my case. Disk usage increases rapidly.

Investigating a random file that did not change since 2017:

BiT last log entry

[I] Take snapshot (rsync: BACKINTIME: cf...p..... home/user/out.log)

[C] cf...p..... home/user/out.log

Running

$ stat /backuppath/backintime/MACHINE/user/1/*/backup/home/user/out.log

results [truncated]

  File: '/backuppath/backintime/MACHINE/user/1/20190430-071835-513/backup/home/user/out.log'
  Size: 1124      	Blocks: 8          IO Block: 4096   regular file
Device: fc00h/64512d	Inode: 83624689    Links: 25
Access: (0644/-rw-r--r--)  Uid: ( 1000/     jns)   Gid: ( 1000/     jns)
Access: 2019-05-03 07:23:28.065557278 +0200
Modify: 2017-10-04 09:11:08.408810922 +0200
Change: 2019-05-01 13:48:55.184458078 +0200
 Birth: -
  File: '/backuppath/backintime/MACHINE/user/1/last_snapshot/backup/home/user/out.log'
  Size: 1124      	Blocks: 8          IO Block: 4096   regular file
Device: fc00h/64512d	Inode: 83624689    Links: 25
Access: (0644/-rw-r--r--)  Uid: ( 1000/     jns)   Gid: ( 1000/     jns)
Access: 2019-05-03 07:23:28.065557278 +0200
Modify: 2017-10-04 09:11:08.408810922 +0200
Change: 2019-05-01 13:48:55.184458078 +0200
 Birth: -
  File: '/backuppath/backintime/MACHINE/user/1/new_snapshot/backup/home/user/out.log'
  Size: 1124      	Blocks: 8          IO Block: 4096   regular file
Device: fc00h/64512d	Inode: 83758208    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/     jns)   Gid: ( 1000/     jns)
Access: 2019-05-03 07:23:28.065557278 +0200
Modify: 2017-10-04 09:11:08.408810922 +0200
Change: 2019-05-03 07:23:28.065557278 +0200
 Birth: -

Inode of that file is 83624689 for all past snapshots and changes to 83758208 in new_snapshot.
As I mentioned before the problem seems to be the destination permissions set to (0644/-rw-r--r--) for pre 1.2.0 snapshots and setting it to the actual source file permission (0664/-rw-rw-r--) in 1.2.0 snapshots.

@jns-

jns- commented May 3, 2019

What brought me back to business is the following:
I changed permissions of files in the last_snapshot folder to match source file settings by extracting that information from fileinfo.bz2. After that backing up with 1.2.0 onto a <1.2.0 snapshot was blazingly fast and no considerable disk space was used.

I do not yet know if this procedure creates any regrettable side effects!

@ghost

ghost commented May 3, 2019

@jns- Whaa! You lost me there... What exactly did you do? If I take a look at fileinfo.bz2 I see:
33188 bella bella /media/bella/TikiStorage/foo.jpg
So what exactly did you do now? Did you change it in fileinfo.bz2, or did you change the permissions of the files themselves in last_snapshot? And if you did the latter, I assume you didn't do this by hand, one by one? Please tell, would love to try it myself in a VM...

@Germar
Member

Germar commented May 3, 2019

No, enabling Preserve ACL makes no difference in my case. Disk usage increases rapidly.

Yes, I realized my mistake after laying my head on the pillow last night. So I got up again and deleted the last part of my comment 😆 Today I'd say you should rather ignore my last post altogether 🙈

I did some further testing which now shows the root of the problem:

$ mkdir src dst1
$ echo "bar" > src/foo
$ cp -al src/foo dst1/foo
$ ls -lai src/foo dst1/foo
6183473 -rw-r--r-- 2 germar germar 4 Mai  3 22:36 dst1/foo
6183473 -rw-r--r-- 2 germar germar 4 Mai  3 22:36 src/foo
$ chmod 664 src/foo 
$ ls -lai src/foo dst1/foo
6183473 -rw-rw-r-- 2 germar germar 4 Mai  3 22:36 dst1/foo
6183473 -rw-rw-r-- 2 germar germar 4 Mai  3 22:36 src/foo
$ 

So, changing permissions of one hard-link will change permissions in all hard-links. Which, after knowing the fact, is totally logical because a hard-link is just a pointer to the inode. It doesn't store anything else. Not even permissions.
And that's why rsync creates a new inode even if there was just a change in permissions...

@jns-

jns- commented May 4, 2019

In my previous posts I used the term 'pre 1.2.0' for BiT versions older than 1.2.0. From now on let's assume it was version 1.1.4 for the sake of better readability.

My intention was to prevent BiT 1.2.0 from creating a full backup on top of my 1.1.4 snapshot history. The idea was to modify the 1.1.4 snapshot history outside of BiT and make it look ok to BiT 1.2.0 next time it is started.

Like @Germar explains, hard links are just pointers to files on the disk and do not store any additional information. Permissions are part of the file the hard link is pointing to.

BiT creates a snapshot folder called last_snapshot. From that folder the latest version of all backup files can be accessed (via hard links), no matter in what snapshot the files were written to the disk. The original source file permissions are stored in fileinfo.bz2 for every snapshot taken. The permissions are somewhat hidden in the first number on each line.

For example in

33204 user group /home/user/file.txt

33204 in octal representation is 100664. The last 3 digits represent the original file permission, which corresponds to -rw-rw-r--. All we have to do is to grab those last 3 digits and chmod 664 /bit_backup_path/last_snapshot/backup/home/user/file.txt. After that the backed up file has the same permissions as its source file and BiT 1.2.0 no longer thinks it needs to back it up again.
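The decimal-to-octal step can be checked in a plain POSIX shell without zsh. This is a minimal sketch; mode_to_perm is a name made up for this example, and like the approach described above it keeps only the last three octal digits, so setuid, setgid and sticky bits are dropped.

```shell
#!/bin/sh
# Hypothetical helper: turn the decimal mode from the first column of a
# fileinfo line into the three-digit permission string expected by chmod.
mode_to_perm() {
    # 0777 is an octal mask that keeps only the rwx bits for user/group/other
    printf '%03o\n' $(( $1 & 0777 ))
}

mode_to_perm 33204   # 33204 is 100664 in octal
mode_to_perm 33188   # 33188 is 100644 in octal
```

Masking with 07777 and printing with '%04o' instead would keep the special bits as a fourth digit; whether that matters for a given backup is not something this thread establishes.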

What I did in detail:

  1. Be very careful and consider backing up your backup files. That should be done with something like
    rsync -a -H --delete --progress /bit_backup/ /backup_of_bit_backup/
    -H is important to preserve the BiT hard link structure.

  2. Start BiT and delete all snapshots created with BiT 1.2.0. Quit BiT. Now you have a 'clean' 1.1.4 snapshot history.

  3. Decompress fileinfo.bz2 from last_snapshot folder to fileinfo.txt and put that file into your home directory.

  4. Now, modify all permissions of files in last_snapshot according to the information grabbed from fileinfo.bz2.

  5. Start BiT 1.2.0 and take a snapshot. It should now create a regular incremental snapshot.

I used

cat fileinfo.txt | while read line; do perm=`echo "${line}" | cut -d" " -f1`; path_=`echo ${line} | cut -d" " -f4-`; perm=$(([##8]perm)); perm="${perm:$((${#perm}-3)):3}"; echo chmod "$perm" "/bit_backup_path/last_snapshot/backup$path_"; done

for step 4. (works in Z shell, probably not in bash). The last echo before chmod has to be removed to effectively change file permissions. Otherwise the command is just echoed to your console.

I recommend testing the permission modification first on a subset of backup files by truncating fileinfo.txt. You can probably find some useless files in your backups to experiment with, for example the .Fonts folder in your home directory or similar.

Running the one-liner may take some time (~350GB corresponding to 200k files took ~15min on my laptop). Optimize it according to your needs and show us something way more elegant 😏

Note 1: As I mentioned in a post above, I do not know if there are any side effects.

Note 2: This procedure does not clean up your complete snapshot history. It only affects the latest version of your backup files. If you restore a file from far in the past the permissions will most probably be wrong.

@colinl

colinl commented May 4, 2019

@jns I can confirm that procedure works for me on Ubuntu, thanks. I had to first install zsh to run it (which is no big deal); it would be nice if someone more knowledgeable than myself could do an sh or bash version in order to avoid this necessity. I can suggest a slight enhancement which avoids manually unpacking the file. The procedure is:
Ensure bzip2 is installed (sudo apt-get install bzip2)
Ensure zsh is installed (sudo apt-get install zsh)
Run zsh from a terminal then

cd path/to/backups/last_snapshot
cat fileinfo.bz2 | bzip2 -d | while read line; do perm=`echo "${line}" | cut -d" " -f1`; path_=`echo ${line} | cut -d" " -f4-`; perm=$(([##8]perm)); perm="${perm:$((${#perm}-3)):3}"; echo chmod "$perm" "backup$path_"; done

As with @jns's version that will just echo all the commands it will perform, once happy that looks correct then repeat the command but without the last echo so

cat fileinfo.bz2 | bzip2 -d | while read line; do perm=`echo "${line}" | cut -d" " -f1`; path_=`echo ${line} | cut -d" " -f4-`; perm=$(([##8]perm)); perm="${perm:$((${#perm}-3)):3}"; chmod "$perm" "backup$path_"; done

You can do that by hitting up arrow then left arrow to the echo text and delete the word echo.
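For readers without zsh, the same idea can be sketched in POSIX sh (so it should also run under bash). This is an untested-on-real-backups illustration, not a replacement vetted by the BiT maintainers: the function name fix_perms and its "apply" argument are made up here, it assumes fileinfo lines of the form "<decimal mode> <user> <group> <path>" as shown earlier in the thread, and like the zsh one-liner it only restores the last three permission digits.

```shell
#!/bin/sh
# Hypothetical sh version of the zsh one-liner.
# Reads decompressed fileinfo lines on stdin.
# $1: path prefix of the snapshot's backup folder (e.g. "backup")
# $2: "apply" to really run chmod; anything else only prints the commands
fix_perms() {
    backup_root=$1
    action=${2:-print}
    # "path" is the last variable, so it receives the rest of the line,
    # including any spaces inside the file name
    while read -r mode user group path; do
        [ -n "$path" ] || continue          # skip empty or malformed lines
        perm=$(printf '%03o' $(( $mode & 0777 )))
        if [ "$action" = "apply" ]; then
            chmod "$perm" "$backup_root$path"
        else
            printf 'chmod %s %s\n' "$perm" "$backup_root$path"
        fi
    done
}
```

A dry run from inside last_snapshot would then look like `bzip2 -dc fileinfo.bz2 | fix_perms backup`, and adding `apply` as a second argument performs the changes. Paths ending in whitespace or containing newlines are not handled by this sketch.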

@ghost

ghost commented May 4, 2019

Thanks @jns- and @colinl for sharing this! I'm sure I'll give it a try too (as soon as I can get these pesky little humans squatting my box to play SuperTuxKart to go away).

@Germar
Member

Germar commented May 5, 2019

Thanks @jns- and @colinl for the fix. I'm planning to implement this to run automatically. But as I'm on vacation for the next few weeks this will take some time...

If you restore a file from far in the past the permissions will most probably be wrong.

BiT will still restore permissions from fileinfo.bz2 after restore. So permissions will be correct no matter how old the snapshot was.

@wiregrasscoder

wiregrasscoder commented May 9, 2019

Building on the work of @jns- and @colinl, I added user and group. For my last 1.1.4 backup, this was necessary to prevent another full backup.

cat fileinfo.bz2 | bzip2 -d | while read line; do perm=`echo "${line}" | cut -d" " -f1`; path_=`echo ${line} | cut -d" " -f4-`; user_=`echo "${line}" | cut -d" " -f2`; group_=`echo "${line}" | cut -d" " -f3`;  perm=$(([##8]perm)); perm="${perm:$((${#perm}-3)):3}"; chmod "$perm" "backup$path_"; chown "$user_":"$group_" "backup$path_"; done

@wiregrasscoder

wiregrasscoder commented May 9, 2019

Upon further review, something is wonky beyond just correcting the permissions, owners, and groups in the last good 1.1.4 snapshot. I've gone through by hand and confirmed that the permissions, owners, and groups in my source files are identical to those files in last_snapshot, but nonetheless, some files continue to be backed up in full (files that haven't changed in years). Running the above command did prevent many of the files from being backed up in full again, but it was not a panacea.

@colinl

colinl commented May 9, 2019

@wiregrasscoder Do the problematic files have unusual permissions in the source? I am seeing it with r--r--r--. If so perhaps you are falling over issue #994
I would also be interested in knowing whether anyone else is seeing issue #993. I find that if I delete a file and make a snapshot, the file is not deleted from the snapshot.

@wiregrasscoder

@colinl I have not seen the r--r--r-- issue. The permissions on the files in my scenario do have setuid and setgid applied. I'm also seeing that after a certain point during the snapshot, the snapshot size begins to grow very quickly and log entries stop, so I'm not sure what's happening.

@Cyber1000

Hi, same problem here: updated from 1.1.24 to 1.2.0.
My config file contains "profile1.snapshots.preserve_acl=false" as stated in example-config https://github.com/bit-team/backintime/blob/master/common/config-example-local
Is this still used with 1.2.0?

In https://github.com/bit-team/backintime/blob/master/common/tools.py I found a no_perms flag, how can I use this, is this something I could write in my config?
profile1.snapshots.no_perms=false (not tested, just a first guess)?

Would the behaviour of 1.2.0 with this config entry be the same as with pre 1.2.0 (without groups, users, perms)?

It would be fine for me to have permissions not synced.

Thanks!

@buhtz
Member

buhtz commented Feb 23, 2023

Maybe after adding --no-perms --no-group --no-owner or removing it again.

In my environment I never modified the rsync arguments and always used the defaults.

@emtiu
Member

emtiu commented Feb 23, 2023

We might have a better shot at isolating the problem if we focused on #994 first. I have a feeling that the root cause is the same, I've seen #994 in the wild myself, and it has more deterministic triggering conditions.

@buhtz
Member

buhtz commented May 14, 2023

Just a quick n dirty note: I realized that there is a --chmod=Du+wx in our rsync call (debug output from a SSH snapshot profile). Not sure but the D indicates it affects directories only. I don't know why it is there and when it was put in there. Should investigate further.

@buhtz buhtz unpinned this issue Jun 26, 2023
@buhtz
Member

buhtz commented Jun 26, 2023

I do read the whole issue thread and wonder if I got this right.

We have a lot of comments here and there are multiple issues and problems addressed. Other relevant tickets are linked. But the problem described here happens only when migrating from <1.2 to >=1.2 BIT. Am I right so far?

@emtiu
Member

emtiu commented Jun 26, 2023

We have a lot of comments here and there are multiple issues and problems addressed. Other relevant tickets are linked. But the problem described here happens only when migrating from <1.2 to >=1.2 BIT. Am I right so far?

To my knowledge, yes, that's correct.

I can't remember if new installations of >=1.2 are also affected, so we might treat that as a "maybe" for the moment.

@danielaixer

danielaixer commented Jul 11, 2023

I do read the whole issue thread and wonder if I got this right.

We have a lot of comments here and there are multiple issues and problems addressed. Other relevant tickets are linked. But the problem described here happens only when migrating from <1.2 to >=1.2 BIT. Am I right so far?

That might be the case. I've recently started using Ubuntu 20.04 with BIT 1.2.1-2 and I have this issue. Adding the PPA and upgrading to BIT 1.3.3-3 hasn't helped. I detected the issue because the same backup on the very same machine was taking wayyy longer, and then I noticed that the target drive was getting filled up way faster than it should.

My old setup (that I can still boot into) is Ubuntu 14.04 with BIT 1.0.34 where the same backup profile still works fine, even with more included paths. I don't think this matters, but one of the source paths is an NTFS drive. However, the target is EXT4, so there should not be permission issues specific to my case.

As @emtiu, I can also confirm that the workaround of adding --no-perms --no-group --no-owner is effective. I was going nuts, so thank you a lot.

@aryoda
Contributor

aryoda commented Sep 7, 2023

And what happens on a drive not supporting Linux file permissions?

BiT still stores permissions inside the fileinfo.bz2 like in previous versions

But is --perms still used then (which may cause a full recopy of the source files if the target file system has a different "umask")?

I think we should test this to see the behavior (full recopy or not).

@aryoda
Contributor

aryoda commented Sep 25, 2023

Summary:

I vote to undo the new permission handling (--perms --group --owner --executability options) introduced in BiT v1.2.0 to get rid of major issues related to this change:

Reasons

The intentions of the new permission handling were

  1. to let rsync instead of BiT handle the backup of permissions
  2. perhaps also to get rid of BiT's own handling of permission backups in the fileinfo.bz2 file
    to allow restores even without BiT (just by using rsync or cp).
    This objective is not achievable since we need fileinfo.bz2 for target file systems that do not support the same
    permissions as the source folders (or have different users and groups).
  3. perhaps also to protect the access to files in the backup with the same permissions as in the source
    (also not achievable, see the prev. point)

Effectively the new permissions handling led to problems like

  1. a full backup of all files (=duplicated) in the first snapshot after updating to BiT v1.2.0++ if older snapshots pre v1.2.0 were taken without the old full sync mode setting. This takes quite long and wastes disk space on the backup target
  2. every change of file permissions leads to a new copy of the file in the next snapshot (not hardlinked!) even though the file itself is unmodified (permissions are metadata)
  3. being affected by a bug in rsync (open since 2017): Deleting hardlinks while deleting an old snapshot resets the file permissions of the same hardlinked file in other snapshots (causing a full copy of the file in the next snapshot due to "changed permissions"). The "smart remove" feature of BiT triggers this unwanted behavior whenever an old snapshot is deleted.
    See Files with permissions r--r--r-- being repeatedly backed up #994 (comment)
  4. Edit: Mount options of the backup target may interfere with rsync's permissions transfer (e.g. for SMB and NTFS-3G it is possible to specify user=...,group=...,umask=...,dmask=... so the permissions are almost always different between source and target, causing a full copy in every snapshot instead of using hardlinks). See e.g. "Remote host doesn't support hardlinks" but manual hardlinks are possible #1164.

Alternatives

  1. Introducing the new permission handling did unintentionally break the backup semantics of BiT so it would be good to

    • make the old permission handling semantics the default again
    • also let the user decide if and when to use the new permission handling

We have PR #1086 for this (thanks to @b3nmore for preparing this PR!).

  2. Fix only Files with permissions r--r--r-- being repeatedly backed up #994 by using rm -rf instead of rsync --delete (as workaround for the rsync bug).

This is only a partial fix, requires a lot of scenario testings and would not solve the other issues (slow first snapshot; full file copy if permissions are changed).

Impact of a non-fix

  • Extremely slow first snapshot after updating to v1.2.0++
  • Waste of disk space at the target drive
  • Slower backups due to "unnecessary" file copies
  • No real value added (unless we could really get rid of fileinfo.bz2 and let rsync handle permissions)
  • Losing users

Next steps

@Germar @buhtz @emtiu I think it is time now to take a decision here -> RfC 😄

@emtiu
Member

emtiu commented Sep 25, 2023

Thank you for the deep analysis, @aryoda, I agree with every point of it.

Your proposed solution also minimizes the necessary testing of the handling of existing backups, because there will only be a few cases:

  1. existing backups from <1.2.0: no change
  2. existing backups from >=1.2.0 with the popular --no-perms --no-group --no-owner workaround: almost no change (settings handling only)
  3. existing backups from >=1.2.0 default handling: testing needed

Since this is an "existential" issue for BiT, I think @Germar's input is especially important.

@buhtz
Member

buhtz commented Sep 25, 2023

Awesome work! Thanks a lot for diving into this. ❤️
As a disclaimer I have to say I do not understand all details. But based on your summary I would support your proposal. 🚀

One question:
It seems that the rsync-upstream bug (https://bugzilla.samba.org/show_bug.cgi?id=12806) is not recognized by rsync's upstream maintainer Wayne Davison.

Jürgen, did you contact Wayne about our issues? And did you point him to his own upstream bug?

Second question:
Would it solve our problems if the upstream bug would be fixed?

@emtiu
Member

emtiu commented Sep 25, 2023

Would it solve our problems if the upstream bug would be fixed?

It would definitely solve #994. We don't understand #988 and #1437 well enough yet to say. Maybe the upstream fix would resolve those, maybe not.

In any case, it would potentially take a long time for the fixed version of rsync to appear in all distros. Changing/fixing the behavior of BiT is much more under our control.

The only point that makes me a little nervous is the handling of existing backups. We need to be very thorough in testing that. But I think that's within our capabilities.

@aryoda
Contributor

aryoda commented Sep 25, 2023

did you contact Wayne about our issues? And did you point him to his own upstream bug?

Not yet, I have just sent a public request at the rsync mailing list (but no developer responded so far)
and have then bumped the issue by adding my script to reproduce the bug (also no response so far).

Second question: Would it solve our problems if the upstream bug would be fixed?

As @emtiu wrote: Not reliably. Furthermore it does not fix things I have just added in my above analysis:
-> Permission mappings in the mount options cause permanent re-backups (very opaque to the end user!).

The only point that makes me a little nervous is the handling of existing backups. We need to be very thorough in testing that. But I think that's within our capabilities.

Yes, it needs testing, but it is a direct "downgrade" (= it no longer treats permission changes as a change), so it seems much less risky than keeping the new permissions handling.

@emtiu
Member

emtiu commented Oct 2, 2023

I vote to undo the new permission handling (--perms --group --owner --executability options) introduced in BiT v1.2.0 to get rid of major issues related to this change:

How about someone create an experimental branch that implements this fix/revert? It would be very useful for testing, and we're going to need a lot of testing :)

@aryoda
Contributor

aryoda commented Oct 2, 2023

How about someone create an experimental branch that implements this fix/revert?

I think I can do this but I need some time (I guess until end of October - I am in the middle of another major roll-out)

It would be very useful for testing

We need a test plan for that with a matrix of backup and restore scenarios:

  • With and without existing snapshots
  • Existing snapshots with old or new permission handling
  • New snapshots with old or new permission handling
  • Different file systems and mount options
  • With and without ssh
  • Different UIDs and GIDs between source and target mounts (for backups as well as restores)
  • ...

We then should automate these tests

  • simulated changes of source data files with a backup in each step
  • restore to check if the permissions and owner are kept as expected

Eg. my MRE bash script test.sh does this just to reproduce the rsync bug and could be used as a basis for test automation (or any other scripting language).

@emtiu
Member

emtiu commented Oct 2, 2023

  • With and without existing snapshots
  • Existing snapshots with old or new permission handling
  • New snapshots with old or new permission handling

I would even go so far as to do some tests with real, large datasets on real hard-drives. Some problems you only notice in "real world" scenarios. I have plenty of large external drives lying around to do that :)

@buhtz
Member

buhtz commented Oct 2, 2023

I fully agree that we should do heavy real-world testing here.
Of course I'll support it when the time comes.

@aryoda
Contributor

aryoda commented Feb 15, 2024

I think we could automate testing of the snapshot source and target folders quite "easily" (not only for this issue) similar to how the rsync test suites work:

https://github.com/WayneD/rsync/blob/2f9b963abaa52e44891180fe6c0d1c2219f6686d/testsuite/rsync.fns#L247

It basically uses diff to compare

  • the directory listings
  • and files in the directory (diff -r)
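A rough sketch of that comparison idea in POSIX sh, written for this thread rather than taken from the rsync test suite: the function names tree_listing and trees_match are hypothetical, the listing is reduced to permission string plus relative path (so timestamps and link counts do not cause false differences), and file names containing newlines are not supported.

```shell
#!/bin/sh
# Hypothetical helper: print "<permission string> <relative path>" for every
# entry below directory $1, in a stable order.
tree_listing() {
    ( cd "$1" && find . | LC_ALL=C sort | while IFS= read -r p; do
        # the first 10 characters of "ls -ld" are the type and rwx bits
        printf '%s %s\n' "$(ls -ld "$p" | cut -c1-10)" "$p"
    done )
}

# Hypothetical helper: succeed only if both trees have the same entries,
# the same permissions, and (via diff -r) the same file contents.
trees_match() {
    [ "$(tree_listing "$1")" = "$(tree_listing "$2")" ] &&
        diff -r "$1" "$2" >/dev/null
}
```

A test could take a snapshot of a prepared source folder, restore it to a scratch directory, and then call `trees_match source restored` to check that content and permissions survived the round trip. Owner and group are not compared here and would need an extra column in the listing.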

@buhtz
Member

buhtz commented Feb 16, 2024

F***ing awesome! 🥳 🎉 🪅 Never realized that rsync itself could have a test suite. This is a very good "documentation" of its behavior.
I see light at the end of the tunnel... 🌞

@ACAwebbuilder

ACAwebbuilder commented May 13, 2024

Hi! I found this bug while trying to find a solution to an issue I am having with BiT 1.2.1. I upgraded my system from Ubuntu 18.04 to Ubuntu Server 22.04. I then installed BiT 1.2.1 (which was an upgrade for me). I ran the initial scan and it went well. Since then, it is running a full scan frequently (not every time, but most times). The previous version just updated what had changed.

Is this happening because the bug wasn't fixed before this update? Or is there possibly something else going on?

Thank you for any thoughts/suggestions.
