
trouble with upper bounds of memory limits #64

Closed
cowboyrushforth opened this issue Apr 13, 2016 · 5 comments

Comments

@cowboyrushforth

Hi,

Thanks for group cache, we are using it with almost great success.

One thing I would like to understand better is how to estimate the upper bound of memory needed by an application whose primary job is simply to serve things from groupcache.

We have 6GB VMs running groupcache, and our first inclination was to set the available cache size to 5GB, leaving 1GB free for the OS and other things.

Right away this crashed under production load because the OOM killer was invoked, so we ended up turning the number down to just under 2GB to keep the process's memory usage under 6GB.

The next step was to profile the memory, and lo and behold, the heap size does not far exceed the 2GB we configured in groupcache, yet the process ends up using 4-6GB of RAM.

The latest attempt was to manually call debug.FreeOSMemory() every couple of minutes; shortly after it runs, the amount of memory the process holds drops, as much of it is returned to the OS.
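
The periodic call is wired up roughly like this (a minimal sketch; the two-minute interval is just what we happen to use, not a recommendation):

```go
import (
	"runtime/debug"
	"time"
)

// startMemoryReclaimer periodically asks the Go runtime to return unused
// memory to the OS. The two-minute interval is arbitrary.
func startMemoryReclaimer() {
	go func() {
		for range time.Tick(2 * time.Minute) {
			debug.FreeOSMemory()
		}
	}()
}
```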

However, we still have occasional crashes from the OOM killer. We added SSD swap as a buffer, but after 48 hours without problems a blip (substantial increase) in traffic caused a single machine to get OOM killed, which then snowballed into several others.

To make this work we could drop the 2GB cache setting (the 2nd parameter to groupcache.NewGroup) even lower, to 1GB, but it seems a bit silly to have a 6GB VM that can only use 1GB for cache.

Is this just a downside of using Go's approach to memory management for caching?

Not sure if it matters, but our use case is very similar to dl.google.com: we serve downloads of medium-sized files (50MB-1GB, cached in 100MB chunks), and groupcache fronts a slower, more expensive API that holds the files we need to serve. So when we found groupcache it naturally seemed like a great fit.
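
For context, the wiring looks roughly like the sketch below (the group name and the fetchChunk helper are placeholders, and the getter signature shown matches newer groupcache versions; older ones take a groupcache.Context instead of context.Context):

```go
import (
	"context"

	"github.com/golang/groupcache"
)

// fetchChunk is a placeholder for the expensive call to the upstream API
// that returns one 100MB chunk of a file.
func fetchChunk(ctx context.Context, key string) ([]byte, error) {
	// ... fetch the chunk identified by key from the slower backend ...
	return nil, nil
}

// chunks fronts the slow API with an in-process cache; 2<<30 is the 2GB
// cacheBytes limit (2nd parameter to NewGroup) discussed above.
var chunks = groupcache.NewGroup("chunks", 2<<30, groupcache.GetterFunc(
	func(ctx context.Context, key string, dest groupcache.Sink) error {
		data, err := fetchChunk(ctx, key)
		if err != nil {
			return err
		}
		return dest.SetBytes(data)
	}))
```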

We would be extremely grateful for any tips you could share to manage this type of issue. I keep thinking there is something I am missing.

Thanks for any insight you can share.

  • scott
@dvirsky

dvirsky commented Apr 13, 2016

This comes mainly from the default GC percent setting of 100 (GOGC=100), meaning a collection is triggered automatically only once the heap has grown to double the live size left after the last GC run.

I was in a similar situation, and figured that for a memory profile like groupcache's it makes more sense to tune it down to something like 10%. See https://golang.org/pkg/runtime/debug/#SetGCPercent
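
Roughly something like this at startup (a sketch; 10 is just the value that worked for my workload, not a magic number):

```go
import "runtime/debug"

func init() {
	// Collect once the heap grows ~10% past the live set from the previous
	// GC, instead of the default 100%. Equivalent to running with GOGC=10;
	// trades more CPU spent in GC for a much smaller gap between live data
	// and total heap.
	debug.SetGCPercent(10)
}
```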

@cowboyrushforth

Thanks for your reply. I will try this very soon and report back! Totally makes sense conceptually.

@dvirsky

dvirsky commented Apr 13, 2016

It was a while ago, but IIRC the overhead was more than 10%, probably around 20%; still, it was sane and didn't require twice the memory limit of the cache.
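
If you want to see the overhead yourself, a rough sketch like this (not what I ran at the time) shows the gap between live heap data and what the runtime is holding from the OS:

```go
import (
	"log"
	"runtime"
	"time"
)

// logHeapOverhead periodically logs live heap vs. memory obtained from the
// OS; the difference is roughly the GC/runtime overhead discussed here.
func logHeapOverhead() {
	for range time.Tick(time.Minute) {
		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		log.Printf("heap live=%d MB, held from OS=%d MB",
			m.HeapAlloc>>20, m.HeapSys>>20)
	}
}
```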

adg closed this as completed Apr 13, 2016
@dvirsky

dvirsky commented Apr 14, 2016

@adg perhaps it's worth mentioning in the README, looks like a common pitfall

@cowboyrushforth

Just to follow up, this has helped tremendously. Thanks again. +1 for adding this to the docs.
