Feature request: max size and/or max objects #5

Open
ncw opened this issue Nov 18, 2012 · 12 comments

@ncw

ncw commented Nov 18, 2012

To make go-cache as useful as memcached, it needs an upper limit on the amount of memory it can use; otherwise it is easy to cache too many things and explode the memory usage.

Ideally it would have a memory limit like memcache does. It could be somewhat approximate as I'm sure it isn't particularly easy to account for memory used in the cache.

A max objects limit as well would probably be useful.

@patrickmn
Owner

I agree it would be useful, but as it stands it would be fairly difficult as you suggest. go-cache, unlike memcached, doesn't process/serialize most of the data, but rather keeps around a pointer to it. Of course, while this makes go-cache orders of magnitude faster, it also means that it isn't distributed in any way (like memcached is), and, unfortunately, that it can't know the size of what it's storing without some serious reflect/unsafe magic/overhead.

The best solution would probably be to make a new cache type that stores items satisfying a Sizeable interface, where foo.Size() represents its relative size as an integer. This is how Vitess' LRU cache does it: https://code.google.com/p/vitess/source/browse/go/cache/lru_cache.go This would transfer that responsibility to the user. I'm not totally sure users can easily determine this in many cases, e.g. when they're storing something more complex than a struct with a []byte field, like a map of structs or slices.
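
A minimal sketch of what such an interface could look like (hypothetical names, not go-cache's actual API; the Vitess cache uses the same pattern):

```go
package cache

// Sizeable is a hypothetical interface a size-bounded cache could require
// of its values. Size returns a relative size (e.g. an estimate in bytes)
// computed by the user, since the cache can't cheaply introspect it.
type Sizeable interface {
	Size() int
}

// Example: a cached value whose size is dominated by its payload.
type Blob struct {
	Key     string
	Payload []byte
}

func (b *Blob) Size() int {
	return len(b.Key) + len(b.Payload)
}
```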

Another solution would be to simply limit the number of items in the cache at any given time, but that assumes that every cache item is roughly the same size. The biggest problem with this is choosing the right items to expunge when necessary. Such a cache would probably end up becoming a clone of LRU (i.e. it keeps track of when items were last accessed, and uses something other than/in addition to a map for the storage) where every item has the same size, e.g. 1.
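
For illustration, a sketch of such a count-bounded LRU built on container/list plus a map (hypothetical, not go-cache's API, and not thread-safe; a real version would need a mutex):

```go
package cache

import "container/list"

// lruCache is a sketch of a count-bounded cache: a map gives O(1) lookup,
// and a doubly linked list tracks recency of use. Every item has the same
// "size" of 1.
type lruCache struct {
	maxItems int
	ll       *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in ll
}

type entry struct {
	key   string
	value interface{}
}

func newLRUCache(maxItems int) *lruCache {
	return &lruCache{
		maxItems: maxItems,
		ll:       list.New(),
		items:    make(map[string]*list.Element),
	}
}

func (c *lruCache) Get(key string) (interface{}, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.ll.MoveToFront(el) // the bookkeeping cost: every Get mutates the list
	return el.Value.(*entry).value, true
}

func (c *lruCache) Set(key string, value interface{}) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.ll.MoveToFront(el)
		return
	}
	c.items[key] = c.ll.PushFront(&entry{key, value})
	if c.ll.Len() > c.maxItems {
		oldest := c.ll.Back() // least recently used
		c.ll.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}
```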

I will keep thinking about the best way to approach this in go-cache, but my gut feeling right now:

  • If you are exhausting memory, perhaps the expiration duration should be shorter?
  • If the expiration duration already is short, and memory usage still explodes, is it safe to assume that there are several application frontends for this system? In that case, perhaps using dedicated memcached boxes would be preferable? A great benefit here is that you can cache items for a long time, but still reliably expire items from a large cache cluster manually, even if many frontends are using the cache.

@ncw
Author

ncw commented Nov 26, 2012

Thanks for the detailed reply and the link to the Vitess cache.

I think you are right that sizing objects would require the user's help. I was naively thinking that it could be introspected, but you are right: the objects might be pointers to arbitrarily complicated structures which may or may not need to be accounted for in the cache. The user could have a go at estimating the size with a Sizeable interface as you suggested.

I still think a total objects limit would be useful. Let's say you are storing user sessions for 24 hours, but you get on the wrong end of a botnet which creates a lot of sessions in a very short period of time. That is the kind of scenario in which memory usage explodes. Flushing user sessions wouldn't be ideal in this case, but it would keep the server running rather than running out of memory.

Memcached could certainly do the job in this case, but I like the idea of doing it really efficiently with go-cache.

@htoothrot

I would like to note that I too believe a total number of objects limit would be useful.

@patrickmn
Copy link
Owner

Sorry for the delay. I am still thinking about how to do this nicely, namely how best to evict items without implementing a copy of Vitess (and the overhead of maintaining an extra list of items and refreshing them on Get). I agree that it's troublesome if somebody explodes your number of items.

@miekg

miekg commented May 23, 2016

Doesn't Redis just randomly evict things from the cache? I think I care more about my cache being memory-bounded than about evicting "the wrong" element. Maybe we can make the actual eviction algorithm pluggable and start with random eviction, which does not need fancy bookkeeping.
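
A sketch of what a pluggable eviction policy could look like (a hypothetical API, not something go-cache provides); the catch, discussed below, is that even random eviction needs a list of keys to pick from:

```go
package cache

import "math/rand"

// Evictor is a hypothetical pluggable eviction policy: given the current
// keys, pick one to evict when the cache is over its limit.
type Evictor interface {
	Pick(keys []string) string
}

// randomEvictor needs no bookkeeping beyond the key list itself.
type randomEvictor struct{ rng *rand.Rand }

func (e randomEvictor) Pick(keys []string) string {
	return keys[e.rng.Intn(len(keys))]
}
```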

@patrickmn
Owner

I like the LRU approach best, but the concerns there are the overhead for read-heavy loads and the memory usage of the Ctime/Atime for each cached item.

Random eviction could work, but I don't know if there's a reliable way to implement it, given that the underlying storage is a map. Map traversal is unpredictable but not truly random, so you can't just traverse, pick the first key, and delete it: https://play.golang.org/p/lUw_1oPTLB But perhaps that is less of a concern for larger maps...
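
The naive version the playground link demonstrates looks roughly like this (a sketch; Go randomizes the iteration start point, but the first key yielded is not uniformly distributed):

```go
package cache

// evictOne deletes "the first" key a range over the map yields.
// Iteration order is randomized but not uniform, so over many calls
// some keys get evicted disproportionately often.
func evictOne(m map[string]interface{}) {
	for k := range m {
		delete(m, k)
		return
	}
}
```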

@cognusion

Random eviction is fine for lazy caches, but when your cache is read-intensive, the objects are expensive to generate, and you end up evicting hot items (and thus immediately cache-missing), it hurts.

@patrickmn patrickmn added this to the 2.1.0 milestone Apr 19, 2017
@patrickmn
Owner

patrickmn commented Apr 19, 2017

Interesting timing; this just hit HN: https://danluu.com/2choices-eviction/

We could conceivably have both options, but LRU still seems best given that we don't have a good way of selecting random items from the cache without maintaining some parallel data structure.
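
For reference, that parallel data structure could be as simple as a slice of keys kept in sync with the map, which makes a uniform random pick O(1) (a sketch, not thread-safe):

```go
package cache

import "math/rand"

// keyedStore keeps a slice of keys next to the map so a uniformly random
// key can be chosen in O(1). This is the parallel-structure trade-off.
type keyedStore struct {
	items map[string]interface{}
	keys  []string       // all keys, in arbitrary order
	pos   map[string]int // key -> index into keys
}

func (s *keyedStore) deleteKey(k string) {
	i, ok := s.pos[k]
	if !ok {
		return
	}
	last := len(s.keys) - 1
	s.keys[i] = s.keys[last] // swap-remove keeps deletion O(1)
	s.pos[s.keys[i]] = i
	s.keys = s.keys[:last]
	delete(s.pos, k)
	delete(s.items, k)
}

// evictRandom removes one uniformly random item.
func (s *keyedStore) evictRandom(rng *rand.Rand) {
	if len(s.keys) == 0 {
		return
	}
	s.deleteKey(s.keys[rng.Intn(len(s.keys))])
}
```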

@chris-pikul

I'm also going to vote for this feature. I'll still use this, but with major caution: a bot attack could kill my service without an upper limit.

I was going to use this for caching user objects keyed by session keys so I don't have to hit my DB on every request, so the expiration time is great for this. I'd be OK with it randomly dropping entries when over the size limit. It would be even better if it just dumped the oldest object, but I imagine there'd be a performance hit on that.

@rfyiamcool

Good idea, but I think designing a max memory limit is better.

@v-antech

v-antech commented Jul 7, 2020

This seems like a major missing feature. There are use cases where long expiration times totally make sense, but the cache must be memory-bounded; otherwise the service will eventually run out of memory and die. Any sort of limitation in that regard is much better than nothing at all, IMO. I added this cache to our system with great ease, but will probably try to replace it with another one that supports this vital feature.

@aaomidi

aaomidi commented Dec 13, 2021

I've started a PR #146

I'll write up some more benchmarking support for this, but it should be a pretty good way of limiting growth. Since it's a cache, invisibly failing writes is not that big of a deal.
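
The general shape of that idea, as a sketch under assumed names (not necessarily what PR #146 does): when the cache is at capacity, Set drops the write, and the caller just sees a later cache miss.

```go
package cache

import "sync"

// boundedCache is a sketch of a count-limited cache whose writes can fail.
type boundedCache struct {
	mu       sync.Mutex
	maxItems int
	items    map[string]interface{}
}

// Set stores the value unless the cache is full. Because this is a cache,
// a dropped write just means a later cache miss, not data loss.
func (c *boundedCache) Set(key string, value interface{}) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.items[key]; !exists && len(c.items) >= c.maxItems {
		return false // silently reject new keys at capacity
	}
	c.items[key] = value
	return true
}
```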

tjvc added a commit to tjvc/gauche that referenced this issue Jun 5, 2022
WIP implementation of a memory limit. This will likely be superseded
by Go's incoming soft memory limit feature (coming August?), but it's
interesting to explore nonetheless.

Each time we receive a PUT request, check the used memory. To calculate
used memory, we use runtime.ReadMemStats. I was concerned that it would
have a large performance cost, because it stops the world on every
invocation, but it turns out that it has previously been optimised.
Return a 500 if this value has exceeded the current max memory. We
use TotalAlloc to determine used memory, because this seemed to be
closest to the container memory usage reported by Docker. This is broken
regardless, because TotalAlloc is cumulative bytes allocated and never
decreases, even as we delete keys.

If we can work out a constant overhead for the map data structure, we
might be able to compute memory usage based on the size of keys and
values. I think it will be difficult to do this reliably, though. Given
that a new language feature will likely remove the need for this work,
a simple interim solution might be to implement a max number of objects
limit, which provides some value in situations where the user can
predict the size of keys and values.

TODO:

* Make the memory limit configurable by way of an environment variable
* Push the limit checking code down to the put handler

golang/go#48409
golang/go@4a7cf96
patrickmn/go-cache#5
https://github.com/vitessio/vitess/blob/main/go/cache/lru_cache.go
golang/go#20135
https://redis.io/docs/getting-started/faq/#what-happens-if-redis-runs-out-of-memory
https://redis.io/docs/manual/eviction/
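
A sketch of the check this commit message describes (handler and limit names are hypothetical): read the runtime memory stats on each PUT and reject the write when over the limit.

```go
package main

import (
	"net/http"
	"runtime"
)

const maxMemoryBytes = 512 << 20 // hypothetical 512 MiB limit

// overLimit reports whether cumulative allocations exceed the limit.
// Caveat from the commit message: TotalAlloc only ever grows, so deleting
// keys never brings this back under the limit.
func overLimit() bool {
	var m runtime.MemStats
	runtime.ReadMemStats(&m) // once stop-the-world; since optimised
	return m.TotalAlloc > maxMemoryBytes
}

func putHandler(w http.ResponseWriter, r *http.Request) {
	if overLimit() {
		http.Error(w, "memory limit exceeded", http.StatusInternalServerError)
		return
	}
	// ... store the key/value ...
}

func main() {
	http.HandleFunc("/put", putHandler)
	http.ListenAndServe(":8080", nil)
}
```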