New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement 'rewrite' command to exclude files from existing snapshots #2731
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good to me 👍
I propose to call the existing code from cmd_forget
to actually remove snapshots. That way also a pruning could be directly started from this command which IMO improves the usage for end users.
cmd/restic/cmd_rewrite.go
Outdated
return true, nil | ||
} | ||
|
||
err = repo.Flush(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You only need to flush the repo when the snapshot was modified.
I also wouldn't flush after finishing a single snapshot, but once after all snapshots are rewritten, i.e. in line 289.
The reason is that flush may finish small packs. If you modify a lot of snapshots generating only few new tree blobs, you may end up in having lots of new packs which each only contain few tree blobs.
However, the snapshot files should only be written after flush finishes with nil error. So also writing the snapshots should be postponed. I suggest returning the new snapshot object here and write all snapshots after Flush
in runRewrite
Thanks a lot for review. I'll try to address everything soon |
I Just mirrored my minio and used this branch to rewrite a few different scenarios. looks good so far. restoring files worked as expected and matched versions restored from original repo. :) The repo is several GB smaller though. I would like to help move this feature forward. |
Hi, |
Is supporting such a feature for a whole repository planned? |
This PR already allows to give multiple snapshots or even multiple snapshot criteria. E.g. you can run it with all snapshots or give multiply We could add a possibility to specify |
Good stuff, thank you. 👍 |
What's the status of this PR? Any chance that it gets merge soon? I'd very much like to use the feature :-D Good work 👍🏼 |
@bertbesser This PR should be considered work in progress. No decision has been made about whether this should be merged or not, the PR is not yet marked as done by the author, and there's a bunch of unaddressed comments. Clearly it's not ready for merging, and the status is simply what you see here in the GitHub GUI where you posted a comment. That said, there's a lot of other stuff going on if you look in this repository, so consider this WIP. Feel free to try it yourself though, just don't consider it production ready (at least not officially). |
Hi, It was working pretty well on my machine in time it was created (but yes, it's very experimental). Good news is that it doesn't touch any data pack files at all. Basically It just adds new snapshot as copy of existing with a few files removed (plus optionally removes old snapshot). I don't think that it can corrupt repository (except new snapshots, that can be just deleted manually) I think that 'rewrite + check + prune' scenario should be safe enough. I've already used it on my own 2.5TB repository without any problems. Removed most of garbage from old snapshots. But I'm not sure that it'll work as is right now, I'll rebase it on top of latest git and try to address most of comments next week. |
This looks like a fantastic feature I sorely miss in Restic. Is there anything I, as a user, can do to help with this PR ? |
Absolutely! I’ve also used it to clean up 30gb of junk from early
backups. I didn’t know how to filter back then. My repo was 40 gb and now
rest consistently under 10 with about 2years of backups. Really helped!!!
But that was also some time ago.
…On Thu, Oct 21, 2021 at 3:26 PM Gaibhne ***@***.***> wrote:
This looks like a fantastic feature I sorely miss in Restic. Is there
anything I, as a user, can do to help with this PR ?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#2731 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACPBAZEXMQKIZCY5DGQ4YRTUIBSNNANCNFSM4M6YEN6A>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
It's good news that it seems to work fine for those who tried it, and the author's description of how it works suggests it might be not very intrusive and have limited ability to cause any harm if not working as intended. But it's still WIP, needs rebasing, need addressing comments and other polishing, not to mention tests and documentation. Let's conclude that this is the case and unless someone does those parts it's not even close to further review, sorry. |
Hi @dionorgua, is there any chance you'd have an update regarding your work on this PR? I may or may have not oopsied a B2 backup again. 🤠 |
202b8c0
to
3d7d495
Compare
I've rebased the PR and made quite a few cleanups (see the commit messages for more details). The main tasks left are:
|
I'm so thrilled that this PR is getting some progress ! Many thanks to the team members for all the latest commits. I will do some test on my mostly "local" repos. I support @MichaelEischer on the naming for Keep on the great work ! |
@NovacomExperts The |
3d7d495
to
329fa17
Compare
7f4d706
to
915c51d
Compare
I've added enough tests to achieve over 80% test coverage for the new code. There's now also a new documentation section. I'll probably split some of the preparatory refactoring commits into separate PRs to reduce the amount of code a bit.
Instead of using tags, using the printed snapshot id should also work. |
I've added a commit to let the The check is not completely optimal, but it should prevent rewriting for all problematic situations. If the need arises to handle this in a more targeted way, we can still adapt the implementation later on. |
In principle, the JSON format of Tree objects is extensible without requiring a format change. In order to not loose information just play it safe and reject rewriting trees for which we could loose data.
The old check did not consider files containing case insensitive excludes. The check is now implemented as a function of the excludePatternOptions struct to improve cohesion.
72196e5
to
bb0fa76
Compare
LGTM. I've made one final change before merging this PR: when just passing
Thanks for testing! Good to hear that the PR indeed worked as expected (it's been in a good shape for some time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM2
Any idea when this will be in the prebuilt binaries? |
It will be in the next release, which is released when it's done. |
@12nick12 You might want to have a look at https://beta.restic.net/ |
FWIW, we're very close to a new release so stay tuned and you'll have it in your hands soon :) But indeed, nothing wrong with using the builds @aawsome pointed to (thanks for pointing that out)! I use them all the time. |
@12nick12 It's out now! See: https://restic.net/blog/2023-01-12/restic-0.15.0-released/ |
What is the purpose of this change? What does it change?
This adds 'rewrite' command that provides a way to remove data from existing snapshots.
This is some sort of preliminary pull requests. It's actually works here on test repo, but lacks tests, etc. Mostly to discuss things
Was the change discussed in an issue or in the forum before?
[ replaces #2720 because it was created from master branch ]
closes #14
Checklist
changelog/unreleased/
that describes the changes for our users (template here)gofmt
on the code in all commits