Prioritization of snapshots during processing (e.g. prune, copy) #1092

Open
simonsan opened this issue Mar 6, 2024 · 0 comments
Labels
S-triage Status: Waiting for a maintainer to triage this issue/PR

Comments

@simonsan
Contributor

simonsan commented Mar 6, 2024

Thanks for the question @mprasil.
The answer is: it is not that easy. Snapshots reference blobs, which are located through the index and ultimately read from the pack files they are stored in. So we have a dependency chain:

snapshots -> index -> packs

To avoid saving invalid snapshots, rustic must be sure that all dependent pack files (i.e. all packs which contain dependent blobs) and all index files which reference the used blobs are actually present in the repository. Therefore we have to wait until they have actually been saved.

If you copy just one snapshot, rustic will therefore copy all blobs and create the index entries. Before saving the snapshot, it will finalize the last pack and the last index file that were started and save them to the storage; only then is the snapshot itself saved to the storage.
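To make the ordering above concrete, here is a minimal sketch of an in-memory model. It is not rustic's actual code: `Repo`, `Write`, `flush_pack`, `flush_index` and `save_snapshot` are hypothetical names invented for illustration, and blobs are just numbers. It only shows that the open pack and the open index are flushed before the snapshot is written.

```rust
/// Kind of file that reached storage (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Write {
    Pack,
    Index,
    Snapshot,
}

/// Hypothetical in-memory repository; none of this is rustic's real API.
#[derive(Default)]
struct Repo {
    /// Blobs buffered in the pack that is currently being filled.
    open_pack: Vec<u32>,
    /// Index entries for blobs whose pack has already been written.
    open_index: Vec<u32>,
    /// Order in which files were written to storage.
    log: Vec<Write>,
}

impl Repo {
    /// Buffer a blob in the currently open pack.
    fn add_blob(&mut self, blob: u32) {
        self.open_pack.push(blob);
    }

    /// Write the open pack, if any, and queue its blobs as index entries.
    fn flush_pack(&mut self) {
        if !self.open_pack.is_empty() {
            self.open_index.append(&mut self.open_pack);
            self.log.push(Write::Pack);
        }
    }

    /// Write the open index file, if it has any entries.
    fn flush_index(&mut self) {
        if !self.open_index.is_empty() {
            self.open_index.clear();
            self.log.push(Write::Index);
        }
    }

    /// Copy one snapshot: packs first, then the index, then the snapshot.
    fn save_snapshot(&mut self, blobs: &[u32]) {
        for &blob in blobs {
            self.add_blob(blob);
        }
        self.flush_pack();
        self.flush_index();
        self.log.push(Write::Snapshot);
    }
}
```

Calling `save_snapshot(&[1, 2, 3])` on a fresh `Repo` leaves `log` as pack, index, snapshot, in that order, which mirrors the dependency chain read from right to left.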

So, in your example, copying snapshots one-by-one and always finalizing pack and index files may result in a lot of small pack and/or index files compared to the current processing.
(And yes, in theory rustic could still try to prioritize blobs and detect when all needed blobs and index entries have actually been written. Then the snapshot could be safely written, too. But this would require a big redesign of how rustic currently works and will take some time to get. It would also, for example, allow deleting pack files earlier during a prune run.)
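A rough back-of-the-envelope sketch of the small-files effect, under simplifying assumptions that are not taken from rustic: packs hold a fixed number of blobs (`capacity`, which must be greater than zero), no blobs are shared between snapshots, and both function names are made up for this example.

```rust
/// Packs written if the open pack is finalized after every snapshot.
/// `capacity` is a hypothetical "blobs per pack" limit and must be > 0.
fn packs_one_by_one(blobs_per_snapshot: &[usize], capacity: usize) -> usize {
    blobs_per_snapshot
        .iter()
        .map(|&n| (n + capacity - 1) / capacity) // ceiling division per snapshot
        .sum()
}

/// Packs written if all snapshots share packs and only the last one is partial.
fn packs_batched(blobs_per_snapshot: &[usize], capacity: usize) -> usize {
    let total: usize = blobs_per_snapshot.iter().sum();
    (total + capacity - 1) / capacity // ceiling division over the whole batch
}
```

With three snapshots of 3, 2 and 4 new blobs and a capacity of 10, the one-by-one variant writes three partially filled packs, while the batched variant writes a single pack.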

Another issue is the priority of copying. You prefer to have newest snapshots first, but maybe someone else would want to have oldest snapshots first...

However, we could make it optional to process snapshots one-by-one, but then users should be able to also choose their own priority (e.g. old vs new first).
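One possible shape for such a user-chosen priority, purely as a sketch: `SnapshotOrder`, `Snapshot` and `prioritize` are hypothetical names and not existing rustic options or types, and `time` is just a Unix timestamp here.

```rust
use std::cmp::Reverse;

/// Hypothetical user-selectable processing order.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SnapshotOrder {
    OldestFirst,
    NewestFirst,
}

/// Simplified snapshot: an id and a Unix timestamp (illustration only).
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    id: &'static str,
    time: u64,
}

/// Return the snapshots in the order in which they should be processed.
fn prioritize(mut snaps: Vec<Snapshot>, order: SnapshotOrder) -> Vec<Snapshot> {
    match order {
        SnapshotOrder::OldestFirst => snaps.sort_by_key(|s| s.time),
        SnapshotOrder::NewestFirst => snaps.sort_by_key(|s| Reverse(s.time)),
    }
    snaps
}
```

The sorted list could then be processed one snapshot at a time, at the cost of the extra pack and index finalization discussed above.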

Originally posted by @aawsome in #1090 (comment)

@github-actions github-actions bot added the S-triage Status: Waiting for a maintainer to triage this issue/PR label Mar 6, 2024