Pathnames depend on backend #654
Comments
Thanks for bringing this up. When we first added the s3 backend, this just slipped in. Probably we'll always have slightly backend-specific filenames. I've thought about adding an option to synchronize (copy) data within backends, #323 mentions this. What do you think about that?
I agree that synchronizing the backend file structure and keeping it in sync would be great. From what I saw playing with the minio S3 backend, behind all the bells and whistles it's just a bunch of files. At some point, not being happy with the speed of my restic/minio setup, I wanted to benchmark the difference between the local and S3 backends. That's where I was bitten by these subtle differences, because trying to use the S3 files directly as a local repo soon turned out to be mission impossible. I had to give up after half a dozen careful renames, with no progress in sight and errors just piling up.
Is it possible that the minio server adds data to the stored files? I'm pretty sure that if this is not the case, you can just use the files and access them locally once they're in the right directory structure.
A "sync/copy" mode would be a good thing to have, at least when there is an actual need for backend-specific file/path names. As far as I understand, the existing difference was introduced by accident. I think it would be a good idea to try to keep the structure compatible wherever possible to make handling of files easy. I would suggest aligning all existing and new backends to the file backend's structure and adding a compatibility layer, maybe even a "migrator" for existing S3 repos (new ones could be created with the correct/fixed structure).
I brought the REST backend (restic-server) up to date with the local backend in this commit. Now it's possible to access the same repo via restic-server or locally. I'm already seeing interesting things: the local restic-server is 26% faster than the local backend. Time will tell how this is even possible :), but for now it looks like this (backup of ~6 GB / 224k files):
There are still lots of improvements I intend to implement in restic-server; you can follow development in this repo: https://github.com/zcalusic/restic-server
In my opinion it won't be possible to have exactly the same structure/filenames for all backends (think more obscure ones such as a MySQL database, or a DHT). Different backends will always have different requirements, for example the local backend creates sub-directories for data files so that the number of files in a single directory is reduced. I think we should always try to create similar structures where possible. And discuss how we can correct the structure the s3 backend uses without breaking anything. |
Agreed, that's what I meant.
By correct you mean "the structure of the file backend" right?
Option two could be costly for big repos (just a guess) but would not require permanent maintenance of the compatibility layer in option one.
@fd0, of course this ticket applies only to filesystem-based backends, currently local, sftp, s3 & rest. Which, by coincidence, is all we have right now (ignoring the mem backend, which exists only for testing purposes). :) But having a DB backend is a neat idea, I must admit. I've had this idea for a long time, to keep backups in PostgreSQL and then have the database replicated for additional security. Someday I may even attempt it, if only to see what happens. :)

@jayme-github, your solution 2) is not costly at all. I just did it with my 67 GB S3 repo, as preparation for putting it under restic-server. Basically it's a few mv/mkdir operations, so I decided to do it from the shell. It took less than 5 minutes (on a low-end rotational hard drive), but most of that time was spent execing the mv command from the shell on a slow CPU. A simple Go program would be an order of magnitude or two faster, so I guess you could convert even multi-TB repositories in a matter of minutes.

I'm attaching the shell script below; the repo passed a local restic check afterwards, so it should be pretty safe. The last few mv/mkdir/rmdir commands are there only to bring the data subfolder down to a manageable size (it was quite large before that; sharding the data folder by hash prefix is definitely important for any repo larger than a few GB). Currently only the S3 backend is missing that feature.
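The attached script itself is not reproduced in this thread. As a rough sketch of the conversion it describes (directory names match the thread; the blob IDs and the throwaway demo repo are illustrative, so this does not touch a real repository):

```shell
# Sketch, not the author's actual script: convert a repo copied out of S3
# (singular type directories, flat data/) to the local backend's layout
# (plural directories, two-character prefix subdirectories under data/).
# A tiny demo repo is built first so the sketch is self-contained.
REPO=$(mktemp -d)
mkdir -p "$REPO/snapshot" "$REPO/key" "$REPO/lock" "$REPO/data"
touch "$REPO/snapshot/1111aaaa" "$REPO/key/2222bbbb" \
      "$REPO/lock/3333cccc" "$REPO/data/4444dddd" "$REPO/data/55ee66ff"

cd "$REPO" || exit 1

# 1. Singular type directories become plural; "index" and "config"
#    are the same in both layouts and need no rename.
for t in snapshot key lock; do
    [ -d "$t" ] && mv "$t" "${t}s"
done

# 2. Shard the flat data/ directory into two-character prefix
#    subdirectories, as the local backend does.
for f in data/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    prefix=$(printf '%.2s' "$name")   # first two characters of the id
    mkdir -p "data/$prefix"
    mv "$f" "data/$prefix/$name"
done
```

As in the comment above, the converted repository should then be verified with restic check before relying on it.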
This aligns the path names generated by the S3 backend with the ones used by the file backend, allowing S3 objects to be used as a file backend and vice versa. The Dirname and Filename generation logic has moved from the file backend to the backend package. Added an environment variable (AWS_LEGACY_PATHS) to the S3 backend which can be set to true to switch to legacy path names (to be used with existing repositories). Fixes restic#654
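As a hedged illustration of the switch this pull request describes: only the AWS_LEGACY_PATHS variable name comes from the PR text; the function below is hypothetical and not restic's actual code.

```shell
# Hypothetical sketch of the path-name selection described above.
# s3_path is illustrative; only AWS_LEGACY_PATHS is from the PR text.
s3_path() {
    # $1 = file type (snapshot, key, lock, index, data), $2 = object id
    if [ "${AWS_LEGACY_PATHS:-false}" = "true" ]; then
        echo "$1/$2"                  # legacy S3 layout: singular, flat
    else
        case "$1" in
            data)  echo "data/$(printf '%.2s' "$2")/$2" ;;  # sharded
            index) echo "index/$2" ;;                       # already singular
            *)     echo "${1}s/$2" ;;                       # pluralized
        esac
    fi
}

s3_path snapshot 1111aaaa   # -> snapshots/1111aaaa
s3_path data deadbeef       # -> data/de/deadbeef
AWS_LEGACY_PATHS=true
s3_path snapshot 1111aaaa   # -> snapshot/1111aaaa
```

With the variable unset, new repositories get the unified file-backend layout; setting it keeps the old singular names for existing repositories.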
We're moving towards unifying the repo layout, this is tracked as #965. I'm closing this issue.
I was trying to move a file-based repository to S3 by copying the files, but I'm unable to access it ("unable to open repo: wrong password or no key found").
Looking closer reveals that the file backend uses plural path names ("snapshots", "keys", "locks", probably from src/restic/backend/paths.go), while (at least) the S3 and Swift backends use the singular FileType (src/restic/file.go) directly ("snapshot", "key", "lock"). It would be cool to have the same path names in every backend so we can move the repos around.
restic check --read-data finishes successfully when I rename the directories to their singular names and remove the sub-directory structure below data:

Original

Modified
Output of restic version:
restic 0.3.0 (v0.3.0-16-g011aee1)
compiled at 2016-10-31 08:38:12 with go1.6.2 on linux/amd64