
Best practice for updating a cache entry frequently #42

Closed
orcaman opened this issue Mar 8, 2015 · 9 comments

Comments

@orcaman

orcaman commented Mar 8, 2015

My question is a bit similar to issue #3.

I have a map that is currently kept in the RAM of a Go application on a single instance. I want to share this map between multiple instances for scaling. I am already using Consul for discovery of peer instances, and I am currently solving this with Redis; however, I am not happy that I am not leveraging each machine's RAM (in that sense I feel Redis is more a DB than a cache). This is one reason why I love groupcache.

I have a constraint though: my map changes all the time (I'm getting requests to update it via http). So for a key K1 in the map, it is likely that m[K1] will be updated very frequently (possibly every one second or less).

So my questions are:

  1. Am I choosing the wrong architecture? Should I use something like Redis or memcached instead?
  2. If groupcache is a good solution for my use case, do I have to constantly remove and add (say in an LRU cache) or is there a smarter way?

Thanks!

@dvirsky

dvirsky commented Mar 8, 2015

Hey @orcaman. I'm just another Groupcache user, but I hope I can answer your question.

First of all, the two most important things to remember are that Groupcache is a read-through-only cache, meaning you can't update keys in it, and second, that data is considered immutable and non-expiring. It's basically just a distributed LRU. So if you need to update keys frequently, GC is not a good option.

However you can emulate expiration or changing of existing data by key manipulation.

In my case, I needed expiration of about an hour for keys. What I did was add the timestamp of the next round hour to any key I'm trying to get. Thus when an hour passes, this part of the key my app is requesting changes, and from Groupcache's point of view, I'm asking for a new key. The old one will get evicted via the LRU mechanism as no one touches it anymore.

But if you need to constantly write keys, and expire at second resolution, you might be better off using redis or memcache. I'd choose memcache as it's easier to scale with more servers.

@orcaman

orcaman commented Mar 8, 2015

Thanks @dvirsky !
I think I get the picture.

@orcaman orcaman closed this as completed Mar 8, 2015
@dvirsky

dvirsky commented Mar 8, 2015

No problem, though I was kinda hoping for more adoption of Groupcache in the Israeli scene. Do you know of any other companies using it? :)

@orcaman

orcaman commented Mar 8, 2015

We might still use it, just waiting for the right use case. :-)
I don't know of a lot of gophers in the Israeli scene so I wouldn't know about Groupcache users. Looking forward to meeting you in the upcoming go meetup (if you are coming, that is).

@dvirsky

dvirsky commented Mar 8, 2015

Cool. We're using it as an outgoing HTTP cache (i.e. caching requests we're making to 3rd parties). We might open-source this implementation sometime.
Not sure about the Go meetup, hope I'll make it.

@dineshappavoo

@dvirsky and @bradfitz - In my use case I would like to handle TTL (planning to add the current hour timestamp to the key). I have the following questions:

1. How does the LRU mechanism remove the key? Can we set a minimum duration the key must go unused (to mark it as expired)?
2. If not, does this depend on the memory allocated for the cache? When the data exceeds the allocation, will the least recently used entries be removed? In that case, how can we balance memory so that the current hour's data is not evicted?

Could you give some suggestions for this?

@qbig

qbig commented Sep 14, 2016

@dvirsky Thanks for your great answer. However, do you see a small spike in CPU and memory every hour when all the items are "evicted" at the same time?

@dvirsky

dvirsky commented Sep 14, 2016

@qbig it depends on how you've implemented the expiration. Here's what I did:

import "hash/fnv"

// turn a TTL into an expiration timestamp, using discrete time windows,
// i.e. ask for an hour and get the nearest hour end as the expiration point.
//
// We pad these discrete boundaries with a pseudo-random margin (based on
// hashing the key) to avoid hitting the cache too hard if all requests
// expire at once.
//
// This can be seconds from now even if you cache for days :)
func calcExpiration(ttl int64, key string, now int64) int64 {
    // we calculate the non-discrete expiration, relative to the current time
    expires := now

    var padding int64 = 0
    if ttl > 0 {
        // now we want to pad it so we won't expire all requests for a given
        // time window at once; to be consistent, the seed of the padding is
        // a hash of the key
        h := fnv.New32a()
        h.Write([]byte(key))
        padding = int64(h.Sum32()) % ttl

        // not sure this is correct - I wrote it long ago :)
        expires += (ttl - (expires % ttl)) - padding
        if expires < now {
            expires += ttl
        }
    }

    return expires
}
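For what it's worth, a self-contained copy of the snippet with a tiny driver (the driver and the concrete numbers are mine, added for illustration) shows the behavior the comments describe: within one window the expiration is stable, and it advances by exactly one ttl at the window boundary.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Copy of calcExpiration from the comment above, reproduced here so
// this example compiles on its own.
func calcExpiration(ttl int64, key string, now int64) int64 {
	expires := now
	if ttl > 0 {
		// per-key pseudo-random padding, seeded by hashing the key
		h := fnv.New32a()
		h.Write([]byte(key))
		padding := int64(h.Sum32()) % ttl

		expires += (ttl - (expires % ttl)) - padding
		if expires < now {
			expires += ttl
		}
	}
	return expires
}

func main() {
	const ttl = 3600 // one hour, in seconds
	now := int64(1000000)

	// Calls inside the same window agree, so a cache key derived from
	// the expiration stays stable until the boundary.
	fmt.Println(calcExpiration(ttl, "user:42", now) == calcExpiration(ttl, "user:42", now+10)) // true

	// One full window later, the expiration moves by exactly one ttl.
	fmt.Println(calcExpiration(ttl, "user:42", now+ttl) - calcExpiration(ttl, "user:42", now)) // 3600
}
```

Since the padding differs per key, different keys cross their boundaries at different instants, which is what smooths out the hourly CPU/memory spike @qbig asked about.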

@trakhimenok

@orcaman take a look at https://tarantool.org/
