Unexpected evictions with LRU policy #2647
Hello, is this an observation about a single key being removed while there were better candidates, or something with statistical significance? The Redis LRU algorithm is an approximation, so if you have many keys with a very long TTL (let's call them set "A") and keys with a short TTL (set "B"), this is what could happen:
The Redis LRU approximation samples the DB for 5 random keys, but may only get keys from set A, since they are so numerous compared to set B. It then picks the best among the sample obtained and evicts a key from A instead of one from B. Is this the case perhaps? If so, this is expected behavior. This may also happen when you have the same number of keys in sets A and B, but that is statistically rare, and should not be too bad for a cache since most of the time Redis will evict A keys.

DBs don't matter too much for eviction, but they matter a bit: keys are removed from each DB one after the other, so the sampling I explained above is performed locally in each DB, and a few keys are evicted from each DB. If you have the A keys in DB 0 and the B keys in DB 1, this is why you see keys from set A evicted at the same rate as keys from set B.
p.s. it should be possible to change the algorithm in order to have cross-DB coherence of LRU. Let's see what the problem is before going forward...
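The sampling behavior described above can be illustrated with a minimal Python sketch (illustrative only, not the actual C implementation; the key names and idle times are made up):

```python
import random

def evict_one(db, sample_size=5):
    """Approximated LRU: sample a few random keys and evict the one
    that has been idle the longest, rather than scanning all keys."""
    sample = random.sample(list(db), min(sample_size, len(db)))
    victim = max(sample, key=lambda k: db[k]["idle"])  # largest idle time
    del db[victim]
    return victim

# 990 keys from set "A" (recently used), only 10 keys from set "B"
# (idle for much longer, so they are the objectively better victims).
db = {f"A:{i}": {"idle": 10} for i in range(990)}
db.update({f"B:{i}": {"idle": 10_000} for i in range(10)})

# Because nearly every sample contains only A keys, an A key is usually
# evicted even though every B key would be a better candidate.
victim = evict_one(db)
print(victim)
```

Running this repeatedly shows the victim is almost always an `A:` key, which is exactly the set-imbalance effect described above.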
Thank you for the fast response. The size of the DBs is indeed very unbalanced:
in my case, the sessions live in DB0 with an expire time of 1 year, the application cache lives in DB1 with an expire of around 1 day (routes, views etc.), and DB2 holds the things that I would like to have removed oldest-first, with an expire of 2 hours: bulk data that gets populated from a cron job. Does this confirm your theory?
Exactly confirmed... It's not optimal how it works, indeed, but I'm not sure we can fix it without impacting performance a lot. Adding everything to a single DB would fix it, BTW. However, are you in the mood to try a few patches in the coming days? Thanks.
Adding them to the same DB would not be good, because we wipe DB1 or DB2 from time to time, but not the sessions. I would assume using different instances would also solve the problem? I read somewhere that you wanted to deprecate the multiple-DBs feature anyway; should we use a cluster instead? I'm not really sure what the equivalent setup would be. Sure, we can try some patches; I can reproduce this rather easily in our environment.
Not going to deprecate it in the near future for sure, but it is not a feature of Redis Cluster, so the advice is to use multiple DBs wisely. Removing all the keys in a single DB is fine if you don't have latency-sensitive workloads. Otherwise, IMHO a better setup for you is to use a single DB, use different prefixes for keys of different groups, and mass-remove when needed using SCAN patterns plus variadic DEL operations. This way you can mass-remove without blocking the server with FLUSHDB, and at the same time you solve your eviction problem. I still think it's worth experimenting with a few patches, but now that I'm aware of your use case, I think your main problem is that FLUSHDB is blocking (it will no longer be blocking in the future, BTW; there is an issue about it).
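The "SCAN pattern + variadic DEL" approach suggested above can be sketched as follows. The batching logic is the point; `scan_iter(match=...)` and `delete(*keys)` mirror redis-py's client API, but a tiny in-memory stand-in is used here so the sketch runs without a server:

```python
class FakeRedis:
    """Minimal stand-in for a Redis client, for illustration only."""
    def __init__(self, keys):
        self.store = set(keys)

    def scan_iter(self, match):
        # Real SCAN iterates incrementally with a cursor and tolerates
        # concurrent writes; this stand-in just snapshots the key set.
        prefix = match.rstrip("*")
        for key in list(self.store):
            if key.startswith(prefix):
                yield key

    def delete(self, *keys):
        removed = self.store & set(keys)
        self.store -= removed
        return len(removed)

def delete_by_prefix(r, pattern, batch_size=500):
    """Delete keys matching `pattern` using one variadic DEL per batch,
    so the server is never blocked the way a FLUSHDB would block it."""
    deleted, batch = 0, []
    for key in r.scan_iter(match=pattern):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += r.delete(*batch)
            batch = []
    if batch:
        deleted += r.delete(*batch)
    return deleted

r = FakeRedis({"cache:1", "cache:2", "session:1"})
print(delete_by_prefix(r, "cache:*"))  # → 2; "session:1" survives
```

With a real redis-py client the same `delete_by_prefix` function works unchanged; with `UNLINK` instead of `DEL` (Redis 4.0+) even the per-batch reclamation happens off the main thread.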
Blocking with FLUSHDB is not such a big problem, because it happens really rarely, but the eviction problem affects the site really badly, because you cannot log in: the session is stored and then gone immediately. Maybe a new algorithm would be a good idea? Since it depends on the use case...
Yep, I'll try with the new algo. Let's see if it can be made as fast as the current one... News ASAP. |
In the meantime, as my colleague noted about this issue, maybe @razvanphp is best served using two separate instances until the problem is investigated and (if possible) solved?
If I understand this discussion correctly, LRU purging happens per-DB? We're seeing issues where we're using a Redis instance configured as an LRU cache for Rails, and we use separate DBs per stage (production, staging, etc). Keys written to the staging DBs are evicted nearly instantaneously - I suspect because the production DB consumes nearly all the memory. Is there any movement towards making eviction happen across all DBs, or do we need to restructure the app to write namespaced keys into the same DB?
@cheald in the long run it is my intention to fix this, however it may require some non-trivial changes in order to avoid totally trashing the LRU algorithm's performance, so I would go in favor of you dropping the multiple-DBs setup, since that does not look like a huge change (apparently), in order to make it work well with current Redis. I'm just currently not sure about when/how this will be fixed, but technically the bug report is accepted and there is willingness to fix it.
UPDATE: I have a working fix for this; I'm improving it and checking for speed regressions. It is a large amount of changed code, so I'm not sure if it will make 3.2 or will remain in the
Nice, how is the algorithm going to work now? or is it a totally different (new) algorithm? We were just preparing to split our redis instances because of this, but if it lands in 3.2 we could give it a try. |
Have you considered using an admission policy to account for frequency in addition to recency? Remains O(1) while also providing near optimal hit rates. |
The LRU eviction code used to make local choices: for each DB visited it selected the best key to evict. This was repeated for each DB. However this means that there could be DBs with very frequently accessed keys that are targeted by the LRU algorithm while there were other DBs with many better candidates to expire. This commit attempts to fix this problem for the LRU policy. However the TTL policy is still not fixed by this commit. The TTL policy will be fixed in a successive commit. This is an initial (partial because of TTL policy) fix for issue #2647.
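The idea in the commit message above can be sketched in Python (illustrative only, not the actual C implementation): instead of evicting the best candidate of each DB in isolation, gather sampled candidates from every DB into one shared pool and evict the globally best key.

```python
import random

def sample_candidates(dbs, sample_size=5):
    """Sample keys from each DB, tagging each candidate with its DB index
    so all DBs compete in one shared pool of eviction candidates."""
    pool = []
    for db_index, db in enumerate(dbs):
        keys = random.sample(list(db), min(sample_size, len(db)))
        pool.extend((db[k], db_index, k) for k in keys)  # (idle, db, key)
    return pool

def evict_global(dbs):
    """Evict the single key with the largest idle time across ALL DBs,
    rather than one key per DB as the old local-choice code did."""
    idle, db_index, key = max(sample_candidates(dbs))
    del dbs[db_index][key]
    return db_index, key

# DB 0 holds hot keys (small idle times), DB 1 holds very idle keys:
# with a shared pool, the DB-1 keys are the victims, as they should be.
dbs = [
    {f"hot:{i}": 1 for i in range(5)},      # idle time 1
    {f"cold:{i}": 1000 for i in range(5)},  # idle time 1000
]
db_index, key = evict_global(dbs)
print(db_index, key)  # always DB 1, one of the cold keys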
@ben-manes hey, I'm working at LFU right now. You can find a first implementation in the |
@razvanphp The fix for this issue is a non-trivial change to the original LRU algorithm. I have the feeling I'll not merge it back into 3.2, but will instead try to push for a 4.0 ASAP.
LFU is optimal for power law, but is less intelligent for many real world workloads. My proposal can be highly compact and achieves near optimal hit rates across a variety of traces. Please take a look at the above link and research paper. |
@ben-manes reading the paper right now. Note that what I implemented supports aging, so it's not naive LFU and adapts over time; it should not be different from what is described in the paper, if I understand the abstract correctly. However, I'm reading it all to check the details. Thanks.
Another aspect to consider is that the design of the cache proposed in the paper may be unacceptable for Redis because of the admission policy concept... While we want Redis to evict things in an intelligent way, I'm not sure it is viable to completely ignore writes.
The majority of writes are one hit wonders and pollution, which is a cause for LRU degrading. The larger history and optional window avoids ignoring good writes. |
btw, the difference between LFU and TinyLFU (LFU with longer history) is between 0-20% in my traces. The difference is on the large side for search and analytical workloads and smaller (5%+) for database. In those traces LFU is still much better than LRU, as expected, so no matter what you'll have a win in a large cache. Ideally if you can acquire redis traces we can simulate the different options so that you can make an informed design. |
@ben-manes I tried to do data-driven development using the tests you can find in the

Btw, the LFU algorithm has two tunable parameters, the logarithmic factor and the decay time, so it is possible to tune how aggressively the LFU counter increases with accesses and how often it is divided by two over time. This should help to experiment with different tuning factors. Every change that can improve the current hit rate without degrading performance will be very welcome.

About remembering the history of keys that are not in the dataset right now, as exploited by the paper you linked: that's an interesting idea, and my LFU implementation does it at least a bit: when a key is replaced with a different value having the same name, we make sure not to forget its LFU history. However, generally applying this concept to Redis is a problem, given that the paper you mentioned is geared towards pure GET/SET caches. A Redis user hardly expects that, even when

Now instead keys are always created and kept for some time before being evicted, which is a lot less surprising, even if a bit less effective when Redis is purely used as a cache.
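The two tunable parameters mentioned above (the logarithmic factor and the decay time) can be sketched roughly as follows. This is a simplified Python model of the counter behavior described for Redis' LFU, not the actual implementation; the exact saturation and decay rules in the C code differ in detail:

```python
import random

LFU_INIT_VAL = 5  # new keys start above zero so they are not evicted at once

def lfu_incr(counter, lfu_log_factor=10):
    """Probabilistic logarithmic increment: the higher the counter
    already is, the less likely an access is to bump it further."""
    if counter >= 255:                       # 8-bit counter saturates
        return 255
    baseval = max(counter - LFU_INIT_VAL, 0)
    p = 1.0 / (baseval * lfu_log_factor + 1)
    return counter + 1 if random.random() < p else counter

def lfu_decay(counter, elapsed_minutes, lfu_decay_time=1):
    """Age the counter as time passes: roughly halve while it is large,
    then decrement, once per elapsed decay period (simplified)."""
    periods = int(elapsed_minutes // lfu_decay_time) if lfu_decay_time else 0
    for _ in range(periods):
        counter = counter // 2 if counter > 2 * LFU_INIT_VAL else max(counter - 1, 0)
    return counter

c = LFU_INIT_VAL
for _ in range(1000):      # a thousand accesses barely move the counter:
    c = lfu_incr(c)        # growth is logarithmic, far below 255
d = lfu_decay(c, elapsed_minutes=60)
print(c, d)
```

Raising `lfu_log_factor` makes the counter grow more slowly (distinguishing very hot keys), while lowering `lfu_decay_time` makes old popularity fade faster; that is the tuning trade-off described above.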
General update for this issue: the cross-DB problems of Redis expire are now fixed into the |
Thanks for the thoughtful response @antirez. Many eviction policies are now segmented to more aggressively evict low-value entries. SLRU has a "probation" segment, S4LRU has levels, LIRS has a "HIR" region, ARC has T1, 2Q has an IN queue, etc. These policies are based on recency, promoting from the low-value region upon a subsequent access. The more advanced ones use "ghost entries" (evicted keys) to monitor a longer history and try to correct if they evicted an entry prematurely. This allows recency-based policies to capture frequency information.

TinyLFU is frequency-based and can be viewed as a straightforward filter. The "admission window" approach (W-TinyLFU) avoids filtering immediately, making it similar to the above policies: in that mode it allows a new entry into the cache and delays the filtering until it no longer has high recency. The regions can use any eviction policy, e.g. S/LRU in the paper as the most optimal (but random does very well). Using this variation you would not break user expectations by having "no effect at all" for SET operations.

I found using synthetic tests (e.g. Zipf) to be less interesting than real-world traces. I used previously published ones, but it would be easy to simulate other workloads if you can acquire some from Redis users. The simulator includes many policies and variants, including memcached's recent adoption of TuQ. We could easily add your LFU variant for comparison and to see its strengths/weaknesses. Let me know if you'd like to collaborate on an analysis.
Please see the traces in Caffeine #106 which shows a workload where LRU is optimal. This causes policies that are based on recency, but try to infer frequency, to match LRU. However policies that are based on frequency and try to infer recency underperform. The longer frequency histogram in TinyLFU causes it to suffer longer than LFU. The variant with an admission window helps to reduce the impact, but stays at a slight disadvantage. That workload may help you analyze Redis' LFU to see how well it adapts. |
Why not implement a WHITE LIST and/or BLACK LIST pattern to prevent or allow eviction of keys starting with
So this really behaves strangely, and we could not understand why after a few hours of testing. We have a master redis 3.0.2 with one slave connected, and we are using 3 databases.
The problem is, with `volatile-ttl`, `allkeys-lru` or `volatile-lru`, keys with a 2-year expire time (PHP sessions) are removed before keys with 2 hours. My understanding would be that those long-TTL keys should be removed last, or am I missing something? Also, a newly added key will always be "less used" than an old one that had a few hits, but in my case I don't want to remove those keys. Isn't SET considered "usage"?
Also, does the order of the DBs matter for eviction? I had the feeling that it removes predominantly from DB 0, but I moved the objects to the last DB and they are still the ones that get removed immediately after a write when there is memory pressure.