Enhancement: add to journal when performing sync mtime-and-treehash #58

Open
ihartley-zz opened this issue Dec 12, 2013 · 7 comments

@ihartley-zz

Hi,

I touch'ed a file, and the algorithm works perfectly and matches the hash so doesn't upload it. If I run the command again, it again matches the hash but re-calculates it each time.

I understand the journal is "append only", but it would be less CPU intensive if the new "mtime" were included in the journal, e.g. two entries for the same file, with matching data, but the second entry having a later mtime. The software would then use the second (later) mtime for matching, so the hash wouldn't have to be calculated each time. As it stands, if I run the backup every day, then every day it will re-calculate the hash for every touch'ed file.
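
The proposal above can be sketched roughly as follows. This is only an illustration in Python, not the tool's actual code: `JournalEntry`, `needs_upload`, and the `compute_treehash` callable are hypothetical names, and the journal is modelled as a plain in-memory list rather than the real journal file format.

```python
from typing import Callable, List, NamedTuple


class JournalEntry(NamedTuple):
    """Hypothetical, simplified journal record."""
    filename: str
    mtime: int
    treehash: str


def needs_upload(
    journal: List[JournalEntry],
    filename: str,
    local_mtime: int,
    compute_treehash: Callable[[str], str],
) -> bool:
    """Decide whether a file must be uploaded, appending a second
    journal entry when only the mtime changed (the proposal above)."""
    entries = [e for e in journal if e.filename == filename]
    if not entries:
        return True  # never seen before: upload

    latest = entries[-1]  # append-only journal: last entry wins
    if latest.mtime == local_mtime:
        return False  # mtime matches: no hashing needed

    # mtime differs, so the hash has to be computed once.
    local_hash = compute_treehash(filename)
    if local_hash == latest.treehash:
        # Same content, new mtime: record it so the next run skips hashing.
        journal.append(JournalEntry(filename, local_mtime, local_hash))
        return False
    return True  # content really changed
```

With this shape, a second run on the same touch'ed file finds the appended entry and returns without calling `compute_treehash` again.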

I don't think this violates any consistency with what's in glacier, since the mtime isn't stored anyway.

Hope that makes sense.

Thanks, H.

@vsespb
Owner

vsespb commented Dec 12, 2013

> I don't think this violates any consistency with what's in glacier, since the mtime isn't stored anyway.

The problem is that mtime is in fact stored on Amazon's servers too, together with the filenames.
And it should be possible to drop the journal at any time and restore it from Amazon's servers without breaking consistency.

@ihartley-zz
Author

OK. But I suppose that if you re-download the journal, it would check the mtime, regenerate the hash, and then consider the files equivalent. So it would do the same as it does right now. So I think a second journal entry to update the mtime would be OK?

Anyway, just a suggestion. I'm very happy regardless. Thanks.

@vsespb
Owner

vsespb commented Dec 12, 2013

Well, if that new, altered mtime does not affect any logic other than the mtime+treehash check, that might work.
I.e. if a person tries to upload with --detect=mtime, the original, real mtime should be used. (Plus there will be other logic related to mtime.)

This could be something other than the journal: a kind of mtime cache, which states that filename+mtime=treehash.
Say, --use-cached-treehash-by-mtime=/path/to/somefile.state
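
A minimal sketch of what such a cache could look like, written in Python purely for illustration. Everything here is an assumption rather than the tool's real design: the class name `MtimeTreehashCache`, the JSON state file layout, and the `compute_treehash` callable that stands in for the real tree-hash routine.

```python
import json
import os
from typing import Callable, Dict, Optional


class MtimeTreehashCache:
    """Hypothetical local cache stating that filename+mtime=treehash."""

    def __init__(self, state_path: str) -> None:
        self.state_path = state_path
        self.entries: Dict[str, str] = {}
        # Load an existing state file; a missing file means an empty cache.
        if os.path.exists(state_path):
            with open(state_path, "r", encoding="utf-8") as fh:
                self.entries = json.load(fh)

    @staticmethod
    def _key(filename: str, mtime: int) -> str:
        # mtime first, so filenames containing tabs cannot collide.
        return f"{mtime}\t{filename}"

    def lookup(self, filename: str, mtime: int) -> Optional[str]:
        return self.entries.get(self._key(filename, mtime))

    def store(self, filename: str, mtime: int, treehash: str) -> None:
        self.entries[self._key(filename, mtime)] = treehash

    def save(self) -> None:
        with open(self.state_path, "w", encoding="utf-8") as fh:
            json.dump(self.entries, fh)


def treehash_with_cache(
    cache: MtimeTreehashCache,
    filename: str,
    mtime: int,
    compute_treehash: Callable[[str], str],
) -> str:
    """Return the treehash, hashing the file only on a cache miss."""
    cached = cache.lookup(filename, mtime)
    if cached is not None:
        return cached
    fresh = compute_treehash(filename)
    cache.store(filename, mtime, fresh)
    return fresh
```

Because the state lives in its own file, deleting that file only costs one round of re-hashing; the journal itself is never touched.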

@vsespb vsespb reopened this Dec 12, 2013
@ihartley-zz
Author

OK, I think I understand: you want to keep the journal pure, just as if it had been downloaded from Glacier.

Personally I don't see an issue with creating a second entry in the journal if the mtime is different but the hash is the same. The worst case is that a new journal needs to be downloaded from Glacier without the second entry, and then the hash will be re-calculated.

But if you like, some local "cache" would do the same job. I can just see, over time, lots of files with a changed mtime where the hash is always calculated. Glacier is the best place for GB/TB/PB, and calculating the hash for such files takes time.

@vsespb
Owner

vsespb commented Dec 12, 2013

I just want to make it clear to end users that it's simply a cache of filename+mtime=treehash (an entry in this cache guarantees that if the file named filename has mtime=mtime, then we can assume the treehash for this file is treehash).

It will affect --detect=mtime-and-treehash and --check-local-hash.

But it will not affect --detect=mtime-or-treehash or --detect=mtime (or the future file versioning feature based on mtime).

Also, if it's in a separate file, the user can just drop it.
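
The scoping described above could be expressed as a small gate. This is a hedged Python sketch, not the tool's actual logic: `cache_is_consulted` and `CACHE_AWARE_DETECT_MODES` are hypothetical names, and `--check-local-hash` is modelled as a simple boolean as it is written in this thread.

```python
from typing import Optional

# Detect modes that, per the comment above, may use the cache.
CACHE_AWARE_DETECT_MODES = frozenset({"mtime-and-treehash"})


def cache_is_consulted(
    detect_mode: Optional[str] = None,
    check_local_hash: bool = False,
) -> bool:
    """Return True when the filename+mtime=treehash cache may be used.

    Other modes (mtime-or-treehash, mtime) always work from the real
    mtime recorded in the journal and ignore the cache.
    """
    return check_local_hash or detect_mode in CACHE_AWARE_DETECT_MODES
```

Keeping the decision in one place would make it easy to guarantee that modes based on the real mtime never see a cached value.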

@ihartley-zz
Author

OK, I like your solution.  If you wish to put it on the feature list, with a low priority, I'll be very happy.  If not, I'll be very happy anyway! :-)


@vsespb
Owner

vsespb commented Dec 13, 2013

Yes, I will leave this ticket open until this is implemented. Low priority indeed.
