
Support for LRU with expire #18

Open
rohitjoshi opened this issue Dec 13, 2019 · 3 comments
@rohitjoshi

Thanks for the high-performance sharedhashfile. Any plans to support LRU with expiry, similar to the Nginx shared cache?

@rohitjoshi changed the title from "support for LRU with expire" to "Support for LRU with expire" on Dec 13, 2019
@simonhf
Owner

simonhf commented Jan 23, 2020

Hi Rohit and thanks for the request.

Regarding expire:

sharedhashfile has been used for passive / lazy expiry (expire upon read) of keys: the caller can achieve this by embedding some kind of timestamp in the value.

Regarding LRU:

This could be achieved by the caller by embedding a doubly linked list in the values, so that all keys are always in chronological order. It would then be easy to find the least recently used key and delete it. But this algorithm would be bad for size and performance: all values would get larger, and there would likely be a locking performance hit, although perhaps that could be eliminated by limiting the LRU doubly linked list to an individual SHF shard (which contains no internal locking).

Another strategy to achieve something similar, but faster, purely in caller code is to have a separate process -- accessing sharedhashfile in parallel -- which reads through all keys in consecutive order in memory (much faster than regular key access, and working in parallel with the regular processes accessing keys) and actively expires keys. I have used a technique like this before both to actively expire keys and to efficiently persist a set of keys held purely in RAM to SSD, without compromising the in-memory key access performance of the other processes.

How would this caller-implemented technique work with LRU? As the keys are consecutively accessed in memory, the continually (but likely throttled) scanning process adds the oldest, e.g., n% of keys to a list of oldest keys. Note: only the shorter key UID is added to the list, not the key itself. This list of oldest keys can be used later to purge older keys on demand. However, with this algorithm the absolute least recently used key is not necessarily purged; only a similarly old, least recently used key is purged. This may or may not fit your use case.

Hope this helps and looking forward to any questions!

@kolinfluence

> This could be achieved by the caller by embedding a doubly linked list in the values, so that all keys are always in chronological order. It would then be easy to find the least recently used key and delete it. But this algorithm would be bad for size and performance: all values would get larger, and there would likely be a locking performance hit, although perhaps that could be eliminated by limiting the LRU doubly linked list to an individual SHF shard (which contains no internal locking).

Actually, this is a "better" way to do it, though I'm not too sure how the SHF shards work.

Would it be possible to introduce this with Golang? I don't mind the lock.

@simonhf
Owner

simonhf commented Dec 10, 2022

> not too sure how the SHF shard works though

FYI there is some explanation of the internals here [1] including the sharding.

[1] https://github.com/simonhf/sharedhashfile/blob/master/src/shf.h#L95
