
trouble with upper bounds of memory limits #64

Closed
cowboyrushforth opened this issue Apr 13, 2016 · 5 comments

Comments

@cowboyrushforth

Hi,

Thanks for group cache, we are using it with almost great success.

One thing I would like to understand better is how to estimate the upper bound of memory needed by an application whose primary job is simply to serve things from groupcache.

We have 6GB VMs running groupcache, and our first inclination was to set the available cache size to 5GB, leaving 1GB free for the OS and other things.

Right away this crashed under production load because the OOM killer was invoked, so we ended up turning the number down to just under 2GB to keep the process's memory usage under 6GB.

The next step was to profile the memory, and lo and behold, the heap size does not far exceed the 2GB we configured in groupcache, yet the process ends up using 4-6GB of RAM.

The latest attempt was to manually call debug.FreeOSMemory() every couple of minutes; shortly after it runs, the amount of memory the process holds drops, as much of it is returned to the OS.
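
The periodic call is wired up roughly like this (a minimal sketch; the two-minute interval is just what we happen to use, not a recommendation):

```go
import (
	"runtime/debug"
	"time"
)

// startMemoryReclaimer periodically asks the Go runtime to return unused
// memory to the OS. The two-minute interval is arbitrary.
func startMemoryReclaimer() {
	go func() {
		for range time.Tick(2 * time.Minute) {
			debug.FreeOSMemory()
		}
	}()
}
```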

However, we still have occasional crashes from the OOM killer. We added SSD swap as a buffer, but after 48 hours without problems a blip (substantial increase) in traffic caused a single machine to get OOM killed, which then snowballed into several others.

To make this work we could drop the 2GB cache setting (the 2nd parameter to groupcache.NewGroup) even lower, to 1GB, but it seems a bit silly to have a 6GB VM that can only use 1GB for cache.

Is this just a downside of using Go's approach to memory management for caching?

Not sure if it matters, but our use case is very similar to dl.google.com: we serve downloads of medium-sized files (50MB-1GB, cached in 100MB chunks), and groupcache fronts a slower, more expensive API that holds the files we need to serve. So when we found groupcache it naturally seemed like a great fit.
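
For context, the wiring looks roughly like the sketch below (the group name and the fetchChunk helper are placeholders, and the getter signature shown matches newer groupcache versions; older ones take a groupcache.Context instead of context.Context):

```go
import (
	"context"

	"github.com/golang/groupcache"
)

// fetchChunk is a placeholder for the expensive call to the upstream API
// that returns one 100MB chunk of a file.
func fetchChunk(ctx context.Context, key string) ([]byte, error) {
	// ... fetch the chunk identified by key from the slower backend ...
	return nil, nil
}

// chunks fronts the slow API with an in-process cache; 2<<30 is the 2GB
// cacheBytes limit (2nd parameter to NewGroup) discussed above.
var chunks = groupcache.NewGroup("chunks", 2<<30, groupcache.GetterFunc(
	func(ctx context.Context, key string, dest groupcache.Sink) error {
		data, err := fetchChunk(ctx, key)
		if err != nil {
			return err
		}
		return dest.SetBytes(data)
	}))
```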

We would be extremely grateful for any tips you could share to manage this type of issue. I keep thinking there is something I am missing.

Thanks for any insight you can share.

  • scott
@dvirsky

dvirsky commented Apr 13, 2016

This comes mainly from the default GC percent setting of 100 (GOGC=100), meaning a collection is triggered automatically only once the heap has grown to double the live size left after the last GC run.

I was in a similar situation, and figured that for a memory profile like groupcache's it makes more sense to tune it down to something like 10%. See https://golang.org/pkg/runtime/debug/#SetGCPercent
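
Roughly something like this at startup (a sketch; 10 is just the value that worked for my workload, not a magic number):

```go
import "runtime/debug"

func init() {
	// Collect once the heap grows ~10% past the live set from the previous
	// GC, instead of the default 100%. Equivalent to running with GOGC=10;
	// trades more CPU spent in GC for a much smaller gap between live data
	// and total heap.
	debug.SetGCPercent(10)
}
```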

@cowboyrushforth

Thanks for your reply. I will try this very soon and report back! Totally makes sense conceptually.

@dvirsky

dvirsky commented Apr 13, 2016

It was a while ago, but IIRC the overhead was more than 10%, probably around 20%; still, it was sane and didn't require twice the memory limit of the cache.
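
If you want to see the overhead yourself, a rough sketch like this (not what I ran at the time) shows the gap between live heap data and what the runtime is holding from the OS:

```go
import (
	"log"
	"runtime"
	"time"
)

// logHeapOverhead periodically logs live heap vs. memory obtained from the
// OS; the difference is roughly the GC/runtime overhead discussed here.
func logHeapOverhead() {
	for range time.Tick(time.Minute) {
		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		log.Printf("heap live=%d MB, held from OS=%d MB",
			m.HeapAlloc>>20, m.HeapSys>>20)
	}
}
```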

adg closed this as completed Apr 13, 2016
@dvirsky

dvirsky commented Apr 14, 2016

@adg perhaps it's worth mentioning in the README, looks like a common pitfall

@cowboyrushforth

Just to follow up, this has helped tremendously. Thanks again. +1 for adding this to the docs.
