Is there any way to merge two (or more) restic repos? #1225

Closed
bherila opened this issue Sep 9, 2017 · 7 comments
Labels
type: feature suggestion suggesting a new feature

Comments

@bherila

bherila commented Sep 9, 2017

Is there a sensible way to merge two repositories that have the same password and contain duplicate data? Ideally this would produce the union of the snapshots in both repositories, and it could be used to "catch up" an online backup location that was disconnected for a while. Perhaps this is related to #1040, but I'm very new to restic and honestly not sure.

Thanks very much in advance for your advice!

@bherila bherila changed the title Is there any way to merge two (or more) repos? Is there any way to merge two (or more) restic repos? Sep 9, 2017
@paul-t-t

If I understand correctly, there's no way to do this that wouldn't involve effectively restoring from one repository and backing up to the other.

Even if you set up the repository to have the same password, the underlying encryption key (which is stored in the repository and encrypted using the password -- see the design docs) and chunker polynomial will be different, which would mean that the contents have to be decrypted and reassembled before being written into the other repository.

I think it would be possible in principle to write an external tool that would mount one repository, then walk through each of the snapshots, backing up the contents to the other.
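
A minimal sketch of the "restore from one repository, back up to the other" route, written as a shell function. The names `copy_snapshot` and `run`, the `DRY_RUN` switch, and all paths are hypothetical and only serve the illustration. The sketch relies on nothing beyond `restic restore --target` and `restic backup`. It restores to a staging directory instead of using a mount, so the new snapshot records the staging path and gets a fresh ID and timestamp.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN is set, else execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: decrypt one snapshot from the source repository into a
# staging directory, then back that directory up into the destination repo.
# Arguments: source repo, destination repo, snapshot ID, staging directory.
copy_snapshot() {
    src_repo=$1
    dst_repo=$2
    snapshot_id=$3
    staging=$4
    run restic -r "$src_repo" restore "$snapshot_id" --target "$staging" &&
    run restic -r "$dst_repo" backup "$staging"
}
```

A call such as `copy_snapshot /srv/old-repo /srv/new-repo latest /tmp/stage` would have to be repeated for each snapshot of interest. restic asks for each repository's password unless one is supplied, for example through `RESTIC_PASSWORD_FILE`.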

@fd0
Member

fd0 commented Sep 13, 2017

There's no way to merge two repositories, and as @paul-t-t pointed out, even if the repos have the same password, the underlying master crypto keys are still different. There are ideas, however, to add a copy command to transfer data from one repo to another; this is tracked in #323. I'm closing this issue as a duplicate of #323. Please subscribe there to be notified of any (eventual) progress.

@fd0 fd0 added duplicate type: feature suggestion suggesting a new feature labels Sep 13, 2017
@fd0 fd0 closed this as completed Sep 13, 2017
@dionorgua
Contributor

Just a small follow-up. I use restic to back up to two different locations (personal cloud storage plus a large USB HDD). The cloud storage is used more often (daily, because it's always ready, with no need to plug in cables, etc.). But periodically (weekly, or sometimes once a month) I also back up to the external HDD.

There is a small hack (there is no guarantee that it will work forever, but it works now). If you make sure that the two repos use the same encryption keys (just run restic init once and then copy the still-empty repo to a different location), then the repositories are 100% compatible and you'll have a better chance of recovering data in case of repo corruption.

Once you init the repositories like this, it's possible to 'merge' them by copying everything from one repo into the other. But please note that if you run RESTIC_REPOSITORY=repo1 restic backup /my-data and then RESTIC_REPOSITORY=repo2 restic backup /my-data, after such a merge your target repo will be about twice the size of a single repo (because the same data is then stored twice, in separate pack files). The repository itself will be OK, though: it passes restic check --read-data, and restic snapshots shows the snapshots from both repositories. Running restic prune then removes the duplicate data.
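
A rough sketch of that hack as shell functions, in case it helps. The function names and paths are made up for illustration, and the whole thing assumes local directory repositories with no restic process touching either repo during the copy. Only `restic init`, `restic check --read-data`, `restic snapshots` and `restic prune` are actual restic commands here.

```shell
#!/bin/sh
# Hypothetical helper: create one repository and clone it while it is still
# empty, so both copies share the same config and key files.
# The second path must not exist yet, otherwise cp nests the copy inside it.
init_twin_repos() {
    restic -r "$1" init || return 1
    cp -R "$1" "$2"
}

# Hypothetical helper: copy every file from the source repo directory into
# the target repo directory, never overwriting a file that already exists.
merge_repo_files() {
    src=$1
    dst=$2
    (cd "$src" && find . -type f) | while IFS= read -r f; do
        if [ ! -e "$dst/$f" ]; then
            mkdir -p "$(dirname "$dst/$f")"
            cp "$src/$f" "$dst/$f"
        fi
    done
}

# Hypothetical helper: verify the merged repo, list its snapshots, then let
# prune remove the duplicated data.
verify_and_dedupe() {
    restic -r "$1" check --read-data &&
    restic -r "$1" snapshots &&
    restic -r "$1" prune
}
```

With two repos created by `init_twin_repos /mnt/repo1 /mnt/repo2`, a merge would then be `merge_repo_files /mnt/repo1 /mnt/repo2` followed by `verify_and_dedupe /mnt/repo2`.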

What's more, if repo1 becomes corrupted (for example, you've lost a few files in the data dir), such a 'merge' will fix the repo (even if repo2 contains no file with the same name).

While it's probably not suited for periodic sync, I think that if the same data is backed up to multiple locations, it's better to use the same keys, because this gives a better chance of fully recovering the whole repo with its full history.

@alphapapa

@dionorgua That's very interesting, thanks for sharing that. I think this should be documented somewhere in Restic, perhaps in a list of example configurations or use cases.

@mholt
Contributor

mholt commented Apr 9, 2018

@alphapapa But if it's documented, then it has to be supported. 😉

Which would be totally awesome -- @fd0, can you confirm whether @dionorgua's 'hack' will work? I ask in combination with reference to #323 -- because if this hack works, that may suffice for the feature request in 323. In fact, maybe 323 could be implemented by using something like that hack, but then a prune or rebuild-index to clean up the duplication? What do you think? (Or am I way off base? 😅 )

@fd0
Member

fd0 commented Apr 9, 2018

I can confirm that the hack will work with the current repository design, and as far as I can see everything in @dionorgua's comment is correct, but I won't make this a feature. So, it works, but it may break at some point.

@cipriancraciun

@fd0 as of the current version (i.e. 2022, v0.13.1), is this "hack" still working (as in, not broken by some recent implementation or design change)? (My assumption is "yes" given restic's backward compatibility, but I would like to be sure. Also, if this "hack" is ever broken in the future, could you leave a small comment on this issue?)


My use case is somewhat similar to what @dionorgua mentions: I have two HDDs that I want to rotate while doing backups, and I want to be able to "merge" them onto a third one. This way there is always at least one good backup offline, and eventually everything gets synchronized together.

My proposed workflow would look like this:

  • let's call the HDDs A, B, and M (A and B are the ones being rotated, M is the "merging" one);
  • I've run restic init on M, then copied everything to A and B (thus everything should be the same, especially the encryption key and chunker seed);
  • periodically run restic backup targeting either A or B;
  • (obviously, from this point on nobody should be touching any of the A, B, or M repositories;)
  • from time to time run rsync A/ M/ (i.e. A to M); if a file exists in both A and B, it shouldn't have been modified, so one could use rsync --ignore-existing to be sure;
  • run restic prune on M, perhaps with arguments to force repacking;
  • run rsync M/ A/ (i.e. M to A), but this time one could use rsync --delete to remove obsolete files in A;
  • at this point A and M should be identical;
  • (the same can be applied to B instead of A.)
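
For illustration, the sync-prune-sync part of the steps above might be scripted roughly like this. `rotate_merge`, `run` and the `DRY_RUN` switch are hypothetical names, the mount points are placeholders, and the sketch assumes all three repositories are plain local directories that nothing else is using. The rsync options used (`-a`, `--ignore-existing`, `--delete`) and `restic prune` are the standard ones; no repacking arguments are passed to prune here.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN is set, else execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: fold one rotated disk into the merge repo, prune the
# merge repo, then mirror the result back onto the disk.
# Arguments: path of the rotated repo (A or B), path of the merge repo (M).
rotate_merge() {
    disk=$1
    merged=$2
    # Copy only files that M does not have yet; existing files are left alone.
    run rsync -a --ignore-existing "$disk/" "$merged/" &&
    # Remove data that is duplicated or no longer referenced in M.
    run restic -r "$merged" prune &&
    # Make the disk identical to M, deleting files that prune made obsolete.
    run rsync -a --delete "$merged/" "$disk/"
}
```

Running `DRY_RUN=1 rotate_merge /mnt/A /mnt/M` first prints the three commands without touching any data, which seems a sensible precaution before the `--delete` step.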


7 participants