
Best practice for updating a cache entry frequently #42

Closed
orcaman opened this issue Mar 8, 2015 · 9 comments

Comments

@orcaman

orcaman commented Mar 8, 2015

My question is a bit similar to issue #3.

I have a map that is currently kept in the RAM of a Go application on a single instance. I want to share this map between multiple instances for scaling. I am already using Consul for discovery of peer instances, and I am currently solving this with Redis; however, I am not happy that I am not leveraging each machine's RAM (in that sense I feel Redis is more a DB than a cache). This is one reason why I love groupcache.

I have a constraint though: my map changes all the time (I'm getting requests to update it via http). So for a key K1 in the map, it is likely that m[K1] will be updated very frequently (possibly every one second or less).

So my questions are:

  1. Am I choosing the wrong architecture? Should I use something like Redis or memcached instead?
  2. If groupcache is a good solution for my use case, do I have to constantly remove and add (say in an LRU cache) or is there a smarter way?

Thanks!

@dvirsky

dvirsky commented Mar 8, 2015

Hey @orcaman. I'm just another Groupcache user, but I hope I can answer your question.

First of all, the two most important things to remember are that Groupcache is a read-through-only cache, meaning you can't update keys in it, and second, that data is considered immutable and non-expiring. It's basically just a distributed LRU. So if you need to update keys frequently, GC is not a good option.

However you can emulate expiration or changing of existing data by key manipulation.

In my case, I needed expiration of about an hour for keys. What I did was add the timestamp of the next round hour to any key I'm trying to get. Thus when an hour passes, this part of the key my app is requesting changes, and from Groupcache's point of view, I'm asking for a new key. The old one will get evicted via the LRU mechanism as no one touches it anymore.

But if you need to constantly write keys, and expire at second resolution, you might be better off using redis or memcache. I'd choose memcache as it's easier to scale with more servers.

@orcaman

orcaman commented Mar 8, 2015

Thanks @dvirsky !
I think I get the picture.

@orcaman orcaman closed this as completed Mar 8, 2015
@dvirsky

dvirsky commented Mar 8, 2015

No problem, though I was kinda hoping for more adoption of Groupcache in the Israeli scene. Do you know of any other companies using it? :)

@orcaman

orcaman commented Mar 8, 2015

We might still use it, just waiting for the right use case. :-)
I don't know of a lot of gophers in the Israeli scene so I wouldn't know about Groupcache users. Looking forward to meeting you in the upcoming go meetup (if you are coming, that is).

@dvirsky

dvirsky commented Mar 8, 2015

Cool. We're using it as an outgoing HTTP cache (i.e. caching requests we're making to 3rd parties). We might open-source this implementation sometime.
Not sure about the Go meetup, hope I'll make it.

@dineshappavoo

@dvirsky and @bradfitz - In my use case I would like to handle TTL (planning to add the current hour timestamp to the key). I have the following questions:

1. How does the LRU mechanism remove the key? Can we set a minimum duration the key must go unused (to mark it as expired)?
2. If not, does this depend on the memory allocated for the cache? When the data exceeds the allocation, will the least recently used entries be removed? In that case, how can we balance memory so that the current hour's data is not evicted?

Could you give some suggestions for this?

@qbig

qbig commented Sep 14, 2016

@dvirsky Thanks for your great answer. However, do you see a small spike in CPU and memory every hour when all the items are "evicted" at the same time?

@dvirsky

dvirsky commented Sep 14, 2016

@qbig it depends on how you've implemented the expiration. Here's what I did:

import "hash/fnv"

// turn a TTL into an expiration timestamp, using discrete time windows,
// i.e. ask for an hour and get the nearest hour end as the expiration point.
//
// We pad these discrete boundaries with a pseudo-random margin (based on
// hashing the key) to avoid hitting the cache too hard if all requests
// expire at once.
//
// This can be seconds from now even if you cache for days :)
func calcExpiration(ttl int64, key string, now int64) int64 {
    // we calculate the non-discrete expiration, relative to the current time
    expires := now

    var padding int64 = 0
    if ttl > 0 {
        // now we want to pad it so we won't expire all requests for a given
        // time window at once; to be consistent, the seed of the padding is
        // a hash of the key
        h := fnv.New32a()
        h.Write([]byte(key))
        padding = int64(h.Sum32()) % ttl

        // not sure this is correct - I wrote it long ago :)
        expires += (ttl - (expires % ttl)) - padding
        if expires < now {
            expires += ttl
        }
    }

    return expires
}
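For what it's worth, a self-contained copy of the snippet with a tiny driver (the driver and the concrete numbers are mine, added for illustration) shows the behavior the comments describe: within one window the expiration is stable, and it advances by exactly one ttl at the window boundary.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Copy of calcExpiration from the comment above, reproduced here so
// this example compiles on its own.
func calcExpiration(ttl int64, key string, now int64) int64 {
	expires := now
	if ttl > 0 {
		// per-key pseudo-random padding, seeded by hashing the key
		h := fnv.New32a()
		h.Write([]byte(key))
		padding := int64(h.Sum32()) % ttl

		expires += (ttl - (expires % ttl)) - padding
		if expires < now {
			expires += ttl
		}
	}
	return expires
}

func main() {
	const ttl = 3600 // one hour, in seconds
	now := int64(1000000)

	// Calls inside the same window agree, so a cache key derived from
	// the expiration stays stable until the boundary.
	fmt.Println(calcExpiration(ttl, "user:42", now) == calcExpiration(ttl, "user:42", now+10)) // true

	// One full window later, the expiration moves by exactly one ttl.
	fmt.Println(calcExpiration(ttl, "user:42", now+ttl) - calcExpiration(ttl, "user:42", now)) // 3600
}
```

Since the padding differs per key, different keys cross their boundaries at different instants, which is what smooths out the hourly CPU/memory spike @qbig asked about.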

@trakhimenok

@orcaman take a look at https://tarantool.org/
