Cache gets dropped #170

Open
karibertils opened this issue Oct 22, 2021 · 18 comments

Comments

@karibertils

karibertils commented Oct 22, 2021

Hello

I have a large library, and warmup takes about 4-6 hours. After it finishes, everything is extremely fast. But after a while it seems like everything gets dropped from the cache, and speeds go back to what they are without warmup.

Can anything be done to avoid the cache being dropped?

@hasse69
Owner

hasse69 commented Oct 25, 2021

There is nothing that would suddenly just drop the cache. But updates to existing cached directories will invalidate parts of it, and depending on the level at which content is changed, this invalidation can be more or less intrusive.

@karibertils
Author

I'm confirming whether they are cached by running find /unrar/section, which takes maybe 5-10 seconds to go through them all. But after the cache drops, the find command pauses on each folder and takes much longer to finish.

Say I have mounted /archives as /unrar

/unrar/section/folder1
/unrar/section/folder2
/unrar/section/folder3
/unrar/section/folder4
...
/unrar/section/folder999

These 999 folders all contain RAR archives and are cached. If folder1 were removed, or folder1000 added, would that normally cause the other folders to drop from the cache? I don't think any other changes are being made.
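
The find-based check described above can be approximated with a small timing helper. This is only a sketch: `timed_walk` is a hypothetical name, and it assumes that a slow walk through the mount point indicates directories that are no longer cached.

```python
import os
import time

def timed_walk(root):
    """Walk every directory under root; return (entries_seen, elapsed_seconds)."""
    start = time.monotonic()
    count = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Each listed directory costs one readdir through the mount point.
        count += len(dirnames) + len(filenames)
    return count, time.monotonic() - start
```

Calling `timed_walk("/unrar/section")` right after warmup and again later gives two timings to compare; a large jump would match the per-folder pauses seen with find.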

@hasse69
Owner

hasse69 commented Nov 25, 2021

Yes, if you change anything in a folder (adding, removing), all its currently cached information is lost and will have to be refreshed. It is possible that the cache invalidation is too aggressive (a better-safe-than-sorry approach), but due to other, more severe issues currently being investigated, this needs to be given low priority.

@karibertils
Author

I see, it's a more aggressive approach than I expected.

No problem, this is obviously low priority and can wait. But thanks for clearing up how it currently works.

@hasse69
Owner

hasse69 commented Nov 25, 2021

You need to be aware of the problem we are facing here. The invalidation has no clue exactly what has changed; it only knows that "something" has changed. That means it must be very defensive. If we only allowed changes through the mount point it could be made a bit more clever, but since the cache can also be invalidated due to external changes it is a lot more work than it might seem. Also keep in mind that external changes can never be exactly pinpointed. External changes are detected by the modification time stamp having changed, not by a specific action on the directory.

(Note that the use of the word "never" above is of course not 100% true. It would be possible to conclude exactly what changed, but that would imply a lot more complexity.)
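
To illustrate why a time-stamp check forces a full refresh, here is a minimal sketch of a directory cache that only compares modification times. `DirCache` and its fields are hypothetical names for illustration and are not taken from the rar2fs source.

```python
import os

class DirCache:
    """Hypothetical path-keyed cache; not rar2fs's actual data structure."""

    def __init__(self):
        self._entries = {}   # path -> (mtime_ns, list of names)
        self.rebuilds = 0    # how many times a listing had to be rebuilt

    def listing(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # time stamp unchanged: serve from cache
        # The time stamp only says "something" changed, not what,
        # so the whole listing is thrown away and rebuilt.
        names = sorted(os.listdir(path))
        self._entries[path] = (mtime, names)
        self.rebuilds += 1
        return names
```

Adding or removing a single file changes the directory's time stamp, and the sketch then rebuilds the entire list because it has no way to tell which name changed.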

@hasse69
Owner

hasse69 commented Nov 25, 2021

What could possibly be an option here is to check if warmup is enabled and, if so, restart the background task(s). But I need to look at that some other time. Currently I have no time to spare even on the more severe issues, I am afraid.

@karibertils
Author

I make modifications all the time, so the warmup would never stop if it ran automatically on modifications. But it would be great if it were possible to restart the warmup background task manually by sending a signal to rar2fs.

In my case it's important that changes outside rar2fs are recognized. But after a RAR archive has been cached, it might as well be cached permanently with the filename as the key. In my particular case the filenames are always unique strings, but size, timestamp, etc. could be added to make the key more unique. If an archive with the same filename and fields shows up anywhere, the decompressed content should always be the same.

Sounds simple on paper, but maybe hard/impossible in practice. Just thinking out loud.
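
The idea above, thinking out loud, could look roughly like the following. This is a sketch of the proposal only, not of how rar2fs works; `archive_key`, `ArchiveContentCache` and the `scan` callback are made-up names, and it assumes name, size and mtime together identify an archive's content.

```python
import os

def archive_key(path):
    """Key built from file name, size and mtime; the directory is ignored."""
    st = os.stat(path)
    return (os.path.basename(path), st.st_size, st.st_mtime_ns)

class ArchiveContentCache:
    """Hypothetical cache of archive listings that survives moves."""

    def __init__(self):
        self._by_key = {}

    def get(self, path, scan):
        # scan(path) stands in for the expensive step of reading the archive.
        key = archive_key(path)
        if key not in self._by_key:
            self._by_key[key] = scan(path)
        return self._by_key[key]
```

With such a key, an archive that reappears in another folder with the same name, size and mtime would reuse the earlier result instead of being scanned again.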

@hasse69
Owner

hasse69 commented Nov 25, 2021

I am not so sure about your statement about warmup never completing. It would obviously not start from the top root folder (unless that is what changed) but from the directory that was invalidated.

The cache is as unique as it can be. The problem with external changes is that what has changed is not known. For that to be possible you would need something like inotify, which is not portable and would also consume an enormous amount of resources, since only individual directories can be monitored and not an entire sub-tree.

To be able to tell what in fact changed you need to compare the cache against reality, and that is basically what warmup would do as well.

A directory cache entry is not a set of sub-entries; it is an entry with a list of nodes. You cannot just remove or add a node to an entry. If an entry is invalidated, so are all its nodes, and thus the list has to be refreshed. I do not see any other option here than to have a warmup doing that for you. The alternative, making individual nodes more dynamic, is a much more complex undertaking.
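
Comparing the cache against reality, as mentioned above, comes down to re-reading the directory anyway. A minimal sketch, with `diff_listing` as a hypothetical helper that is not part of rar2fs:

```python
import os

def diff_listing(cached_names, path):
    """Return (added, removed) names by comparing a cached list with the disk."""
    actual = set(os.listdir(path))   # the directory must be read in full
    cached = set(cached_names)
    return sorted(actual - cached), sorted(cached - actual)
```

Finding out what changed already costs a complete listing of the directory, which is the same work a refresh of the entry would do.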

@hasse69
Owner

hasse69 commented Nov 25, 2021

Another thing that I guess is not obvious is that the directory cache never caches actual external content. It only caches what is located inside RAR archives. So consider the simple case of an external directory A having an archive containing directories B and C. How would you deal with the case where that archive is changed or replaced and suddenly has only directory B? The only safe way to deal with this is to invalidate everything sitting under A.
There are so many corner cases to consider and that is what is making all this pretty difficult. The more complexity you add the more corner cases might be overlooked.
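
The "invalidate everything sitting under A" rule can be sketched as a prefix purge over a path-keyed cache. The dictionary layout and the name `invalidate_subtree` are assumptions for illustration only.

```python
def invalidate_subtree(cache, root):
    """Drop root and every cached path below it; return the dropped paths."""
    base = root.rstrip("/")
    prefix = base + "/"
    # Match on "A/" rather than "A" so that a sibling like "/AB" is kept.
    doomed = [p for p in cache if p == base or p.startswith(prefix)]
    for p in doomed:
        del cache[p]
    return doomed
```

In this sketch a change to the archive in A removes the entries for A/B and A/C together, even when only C actually disappeared.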

@hasse69
Owner

hasse69 commented May 6, 2022

Since more users seem interested in this topic, would a beta patch introducing an option to auto-trigger the warmup have any considerable value?

@milesbenson

milesbenson commented May 7, 2022

Yes, I'm willing to test, as my mounts get updated frequently.

@karibertils
Author

Yeah I would like to test it.

@milesbenson

It's been a while ;-)
How can I refresh the cache manually without having to remount? Just by running ls -R on the mount folder?

@hasse69
Owner

hasse69 commented Mar 7, 2023

Sadly yes, but unless you know exactly from where in the path your cache got dropped I would say it is faster to remount and rely on the warmup.

@hasse69
Owner

hasse69 commented Mar 7, 2023

I guess if you feel adventurous, a quick hack could be added to the code in which the warmup never ends and simply restarts itself over and over.

@milesbenson

Well, my cache gets dropped when I add new files/folders, so ls -R would be OK. But if you want me to, I can test the other option as well.

@hasse69
Owner

hasse69 commented Mar 7, 2023

The deeper in the path the change is made, the smaller the effect should be. But I think I have mentioned this before somewhere.

@milesbenson

It's a simple story:

ARCHIVE/B/B.MOVIE/bmoviefiles.rar
ARCHIVE/D/D.MOVIE/dmoviefiles.rar

When I add a bunch of movies into the lettered subfolders, the cache can get dropped or need some love with ls -R.
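
For a layout like the one above, the ls -R workaround could be limited to the lettered subfolders that were just modified. This is a sketch under the assumption, implied by the ls -R suggestion earlier in the thread, that listing directories through the mount point is what repopulates the cache; `refresh_subtrees` is a hypothetical helper.

```python
import os

def refresh_subtrees(mount_root, changed_subdirs):
    """List every directory under each changed subfolder.

    Returns {subfolder: number_of_directories_visited}.
    """
    visited = {}
    for sub in changed_subdirs:
        top = os.path.join(mount_root, sub)
        # os.walk lists each directory below top, much like ls -R would.
        visited[sub] = sum(1 for _ in os.walk(top))
    return visited
```

For example, `refresh_subtrees("ARCHIVE", ["B", "D"])` would walk only ARCHIVE/B and ARCHIVE/D after new movies were added there, leaving the other letters alone.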

3 participants