Output of restic version
restic 0.12.0 compiled with go1.15.8 on linux/amd64
What should restic do differently? Which functionality do you think we should add?
Currently, when backing up data, restic compares the size, mtime, ctime and inode of a file instead of reading it in full. Some storage platforms (like Ceph and EOS) recursively propagate updates to certain extended attributes up through the parent directories to the root whenever a file changes. This could be exploited to skip entire folders, since we can be 100% sure that the folder (and every file in it, no matter how deeply nested) has not changed since the previous snapshot. So it would be nice to have an option (like the one proposed in #2902) for comparing a user-specified xattr, which is checked against folders as well. This functionality has two main advantages:
skipping folders that we already know were not updated since the last snapshot will save a lot of time, especially if you have millions of resources in that folder
reduces the load on production storage instances
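The proposed check can be sketched as follows. This is a minimal illustration under stated assumptions, not restic's implementation: the function names, the `parent_values` mapping (directory path to the xattr value recorded in the previous snapshot) and the attribute name in the usage note are all hypothetical. A missing attribute is treated as "cannot skip", so directories on storage that does not maintain such an attribute are always processed.

```python
import os
from typing import Callable, Dict, List, Optional

# Hypothetical reader signature: returns the raw xattr value, or None if missing.
XattrReader = Callable[[str, str], Optional[bytes]]


def read_xattr(path: str, name: str) -> Optional[bytes]:
    """Read one extended attribute; return None when absent or unsupported."""
    try:
        return os.getxattr(path, name)  # os.getxattr is only available on Linux
    except (OSError, AttributeError):
        return None


def can_skip_dir(path: str, attr_name: str,
                 parent_values: Dict[str, bytes],
                 reader: XattrReader = read_xattr) -> bool:
    """True only if the directory's xattr equals the value stored for it
    in the parent snapshot (assumed to be recorded at backup time)."""
    current = reader(path, attr_name)
    previous = parent_values.get(path)
    return current is not None and current == previous


def dirs_to_scan(root: str, attr_name: str,
                 parent_values: Dict[str, bytes],
                 reader: XattrReader = read_xattr) -> List[str]:
    """Walk root top-down, pruning subtrees whose xattr is unchanged."""
    to_scan = []
    for dirpath, dirnames, _ in os.walk(root):
        to_scan.append(dirpath)
        # In-place assignment tells os.walk not to descend into pruned dirs.
        dirnames[:] = sorted(
            d for d in dirnames
            if not can_skip_dir(os.path.join(dirpath, d), attr_name,
                                parent_values, reader)
        )
    return to_scan
```

For example, `dirs_to_scan("/data", "user.example.tree_mtime", parent_values)` would return only the directories that still need to be read, where `user.example.tree_mtime` is a placeholder for whatever attribute the storage platform updates recursively. The root itself is always scanned in this sketch.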
In addition, restic currently stores only the processed files in a new snapshot, so skipping a folder based on an extended attribute would mean omitting that subtree from the new snapshot. It would therefore be nice to have a merge option that merges the new snapshot with its parent. This would help when restoring a full folder, or part of it, from a snapshot, without having to check whether previous snapshots contain additional files or folders that were skipped in the snapshot being restored. It would also be useful because a prune could otherwise delete older snapshots that contain useful resources not included in the newer ones. This has already been mentioned in the forum, for example in https://forum.restic.net/t/merge-restic-snapshots/4364 and https://forum.restic.net/t/backup-parent-behavior/3286.
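To make the merge idea concrete, here is a small sketch that assumes a heavily simplified model: a snapshot is a flat mapping from file path to content ID, and the backup run records which directories it skipped. Neither the data model nor the `merge_with_parent` function exists in restic; both are illustrative assumptions. Only entries under a skipped directory are copied from the parent, so files that were deleted in directories that were actually processed are not resurrected.

```python
from typing import Dict, Iterable


def merge_with_parent(new: Dict[str, str],
                      parent: Dict[str, str],
                      skipped_dirs: Iterable[str]) -> Dict[str, str]:
    """Fill the skipped subtrees of `new` with the entries from `parent`.

    `new` and `parent` map file paths to content IDs (hypothetical model).
    `skipped_dirs` lists directories the backup did not descend into.
    """
    prefixes = [d.rstrip("/") + "/" for d in skipped_dirs]
    merged = dict(new)
    for path, content_id in parent.items():
        # Copy only entries that live under a skipped directory.
        if any(path.startswith(p) for p in prefixes):
            merged.setdefault(path, content_id)
    return merged


if __name__ == "__main__":
    parent = {
        "home/a/x.txt": "id1",
        "home/a/sub/y.txt": "id2",
        "home/b/old.txt": "id3",  # deleted before the new backup
    }
    new = {"home/b/new.txt": "id4"}  # home/a was skipped via its xattr
    for path in sorted(merge_with_parent(new, parent, ["home/a"])):
        print(path)
```

A plain union of the two snapshots would not be enough, because it could not tell a skipped file from a deleted one; that is why the sketch needs the list of skipped directories as an explicit input.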
What are you trying to do? What problem would this solve?
This will save a lot of time when backing up really big folders and reduce the load on instances used every day by thousands of users.
Did restic help you today? Did it make you happy in any way?
Restic is a fantastic tool that we use at CERN to back up ~40K home and project directories of our users every day, resulting in >300M new files every day, for a total of 4PB of data being backed up.
Allowing the archiver part of the backup command to skip directories based on an extended attribute would be fairly easy to implement. However, I currently see two problems that would have to be solved:
The backup command consists of a scanner, which just counts how many files have to be backed up (only for statistics!), and the actual archiver component, which backs up everything. The scanner currently does not use the parent snapshot in any way, so it would not be able to skip directories based on an extended attribute either. It would probably be possible to add that functionality to the scanner, but we'd have to ensure that it doesn't regress performance when extended attributes are not used.
When directories are skipped, the statistics won't add up. That would probably be acceptable. Reconstructing the statistics based on the directory metadata from the backup repository is probably overkill.
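The two points above can be illustrated with a toy counter. This is not restic's scanner (which is written in Go); `scan_count` and its `skip_dir` parameter are hypothetical names used only for this sketch. When no predicate is passed, the walk does no extra work per directory, and when one is passed, the total no longer includes files in skipped subtrees, which is why the statistics would not add up.

```python
import os
from typing import Callable, Optional


def scan_count(root: str,
               skip_dir: Optional[Callable[[str], bool]] = None) -> int:
    """Count the files under root, optionally pruning skipped directories."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        total += len(filenames)
        if skip_dir is not None:
            # Only pay for the predicate when the feature is enabled.
            dirnames[:] = [
                d for d in dirnames
                if not skip_dir(os.path.join(dirpath, d))
            ]
    return total
```

The difference between `scan_count(root)` and `scan_count(root, skip_dir)` is exactly the number of files in skipped subtrees, which the scanner could not report without consulting the parent snapshot.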
could be nice to have a merge option, that will merge the new snapshot with a parent one
I've seen #3405, which I plan to look at whenever I get around to looking at snapshot rewriting and similar functionality. But I can't make any promises about when that will happen.