v2 contracts #74
The approximate API of the builder:

```go
type Builder[K comparable, V any] interface {
	// It seems better to separate cost and size after all.
	// In the experience of ristretto and communication with otter users,
	// people do not perceive specifying both with one parameter well.
	//
	// Also, with this API we will be able to prohibit the simultaneous
	// use of the MaximumSize and Cost functions.
	MaximumSize(maximumSize int) Builder[K, V]
	MaximumCost(maximumCost int) Builder[K, V]
	CollectStats() Builder[K, V]
	InitialCapacity(initialCapacity int) Builder[K, V]
	Cost(costFunc func(key K, value V) uint32) Builder[K, V]
	DeletionListener(deletionListener func(key K, value V, cause DeletionCause)) Builder[K, V]
	// We are slightly reducing the cognitive load on the user.
	TTL(ttl time.Duration) Builder[K, V]
	// It seems that a variable ttl is almost never required, and
	// the user should have at least some rule by which they want to calculate the ttl.
	//
	// In principle, Set(key, value, ttl) can be moved to Extension.
	//
	// It also becomes harder to use when it is not needed. :)
	VarTTL(ttlFunc func(key K, value V) uint32) Builder[K, V]
	Build() (Cache[K, V], error)
}

func NewBuilder[K comparable, V any]() Builder[K, V] {
	...
}
```
---
```go
type Cache[K comparable, V any] interface {
	Has(key K) bool
	Get(key K) (V, bool)
	Set(key K, value V) bool
	// Should this return the existing value?
	// I also want to return the previous value, but then how do we
	// account for rejection due to the size?
	SetIfAbsent(key K, value V) bool
	Delete(key K)
	DeleteByFunc(f func(key K, value V) bool)
	Range(f func(key K, value V) bool)
	Clear()
	Close()
	Size() int
	Stats() Stats
	Extension() Extension[K, V]
}
```
---
I'd vote for changing the terms. TTL is a pretty well-known term, but what "live" means depends on the implementation. Some use "time since creation" whereas others use "time since last write". Or sometimes people will say TTL meaning the time for the client receiving it, like an HTTP expiry duration header. So I think the acronym could be misinterpreted vs. a more generic term.

---
It may be worth no longer returning a bool in case of rejections. Then the API becomes a little simpler. Still, otter guarantees that it can reject an item only because of the eviction policy, and not, like ristretto, because of contention.

```go
type Cache[K comparable, V any] interface {
	Set(key K, value V)
	SetIfAbsent(key K, value V) (V, bool)
}
```
---
It doesn't really matter to me here. It can be replaced with weight.
Oh, that's true. I have not yet seen any Go implementation that uses "expire after last access", except goburrow, but it explicitly separates this from "expire after create/last write". But if we separate the different versions by
I need to answer one question: can these functions be combined for a single cache instance? It seems to me that this is absolutely useless, but if there is some kind of use case, then there will be a very unpleasant problem with the expiration policy with a variable ttl. And in fact, a similar problem is very likely to exist in

```go
type Builder[K comparable, V any] interface {
	ExpireAfterCreate(duration time.Duration) Builder[K, V]
	ExpireAfterWrite(duration time.Duration) Builder[K, V]
	ExpireAfterAccess(duration time.Duration) Builder[K, V]
	ExpireAfter(/* what??? */) Builder[K, V]
}
```

It seems that there is a need to use an interface with several methods. But

Ideally, we would avoid using the same interface for all three functions. Could it be something like this?

```go
type Builder[K comparable, V any] interface {
	ExpireAfter(opts ...ExpiryOption[K, V]) Builder[K, V]
}

type ExpiryOption[K comparable, V any] func(*expiry[K, V])
```

But this again looks doubtful.

Yes, a goroutine is used for the worker. In general, the use of

---
They shouldn't be. Guava/Caffeine allowed it prior to variable-expiration support, naively, as it seemed to have little cost, but I'd stop allowing multiple expiration policies if possible. Usually users seem to set both
Right, I have an expiry interface with create, update, read methods. Ideally it would be on the builder for easiest discovery, where all of the options funnel into variable expiration. You can see the discussion in ben-manes/caffeine#1499, as users often made mistakes with the interface and of course it was pretty verbose. If I could start anew then I'd make it convenient and fail at runtime if multiple were selected:

```java
Caffeine.newBuilder() // user can choose one
    .expireAfterCreate(Duration.ofMinutes(5))
    .expireAfterCreate((K key, V value) -> duration)
    .expireAfterWrite(Duration.ofMinutes(5))
    .expireAfterWrite((K key, V value) -> duration)
    .expireAfterAccess(Duration.ofMinutes(5))
    .expireAfterAccess((K key, V value) -> duration)
    .expireAfter(new Expiry...)
```

For backwards compatibility I went with static factories on the interface so they can do one of:

```java
.expireAfter(Expiry.creating((Key key, Graph graph) -> duration))
.expireAfter(Expiry.writing((Key key, Graph graph) -> duration))
.expireAfter(Expiry.accessing((Key key, Graph graph) -> duration))
```

They can define their own

---

Does that need to be long-lived? I use a state machine to schedule the maintenance worker, so only one should be in-flight. This submits a task to an

I'd avoid finalizers, as at least in Java they are considered a misfeature and are being deprecated for removal. However, Java offers other options like weak and phantom references that cover the cleanup needs better, without penalizing the GC or quirks like object resurrection. Maybe they're cleaner in Go.

---
Great, then this simplifies the situation a bit.

```java
.expireAfterCreate(Duration.ofMinutes(5))
.expireAfterCreate((K key, V value) -> duration)
```

There is a small problem: there is no method overloading in Go, and that even makes sense. If we can do without

What about this?

---

Actually, no, that's largely why I asked about removing goroutines, but most likely I didn't formulate it very clearly. It would be cool to abandon this, but so far I don't understand the benefits for users from this, unlike adding

In fact, finalizers are incredibly often used when working with cgo. Moreover, there are simply no other good ways to clean up resources other than finalizers and

---
Yeah, LoadingCache is really complicated.

```go
type LoadingCache[K comparable, V any] interface {
	...
	Get(ctx context.Context, key K) (V, error)
	BulkGet(ctx context.Context, keys []K) (map[K]V, error)
	// extension?
	Extension() LoadingExtension[K, V]
}

type LoadingExtension[K comparable, V any] interface {
	GetQuietly(ctx context.Context, key K) (V, error)
	GetEntry(ctx context.Context, key K) (Entry[K, V], error)
	GetEntryQuietly(ctx context.Context, key K) (Entry[K, V], error)
}

// :(
type SingleLoader[K comparable, V any] interface {
	Load(ctx context.Context, key K) (V, error)
}

type BulkLoader[K comparable, V any] interface {
	BulkLoad(ctx context.Context, keys []K) (map[K]V, error)
}

type SingleBulkLoader[K comparable, V any] interface {
	SingleLoader[K, V]
	BulkLoader[K, V]
}

type Builder[K comparable, V any] interface {
	// and we are trying to cast to the BulkLoader inside,
	// otherwise we use the standard implementation of BulkLoad.
	BuildLoading(loader SingleLoader[K, V]) (LoadingCache[K, V], error)
}
```
---
This seems more obvious; not sure about Go idioms, whether that is a common naming style.

Unfortunately Go API design is outside my expertise. I was able to use Java features to layer it in a way that reduces user complexity, e.g. using default methods. The context wasn't needed since Java is typically paired with dependency injection with request-scoped providers, e.g. to get the logged-in user bound to that thread. If not, then the user likely will use a loading

---
The main problem is that

Since it changes from request to request, it should also be used in the loader. And the idea of casting the interface is actually often used in the standard library, for example here. I need to think about it, but at least the loss of timeouts, spans, and information in the logs looks very critical.

---
It sounds like

```go
func GetIfPresent(key K) (V, bool)
func Get(ctx context.Context, key K, mappingFunction func(K) V) (V, error)
func GetAll(ctx context.Context, keys []K, mappingFunction func([]K) map[K]V) (map[K]V, error)
```
---
To be honest, I didn't really understand :(. Why do I need to specify the loader every time I use the method?

```go
type (
	LoaderFunc[K comparable, V any]     func(ctx context.Context, key K) (V, error)
	BulkLoaderFunc[K comparable, V any] func(ctx context.Context, keys []K) (map[K]V, error)
)

type Cache[K comparable, V any] interface {
	Has(key K) bool
	Get(key K) (V, bool)
	BulkGet(keys []K) (map[K]V, bool)
	// We are trying to use the loader passed when creating the cache instance,
	// if nothing was passed to Load.
	// But what if the loader has not been passed anywhere?
	Load(ctx context.Context, key K, loader ...LoaderFunc[K, V]) (V, error)
	BulkLoad(ctx context.Context, keys []K, bulkLoader ...BulkLoaderFunc[K, V]) (map[K]V, error)
}
```

Or we can even use a more canonical `Option`. Unfortunately, this is one allocation per request.

```go
Load(ctx context.Context, key K, opts ...LoadOption[K, V]) (V, error)
BulkLoad(ctx context.Context, keys []K, opts ...LoadOption[K, V]) (map[K]V, error)
```

But what to do with the methods in the

It is also interesting what needs to be done to allow the use of an explicit ttl. We can hide this in the

@proost for me, the methods in

For the user, Load only automatically downloads data from an external source if nothing is found. Although with the integration of the API, the line is really blurred.

---
Also, if we use `WithMaximumSize[K, V](1000)` I used the builder to create a more typed cache and add

---
When I was trying to make In terms of the API

That is, if the second returned value is false, users will expect the first returned value to be the V already in the cache. However, in case of rejection, there may be no such V in the cache and users have no way to tell. If we want to keep the rejection behavior, I think a slightly more complicated API is needed, such as

---
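The concrete suggestion was elided above, but one way to disambiguate "value already present" from "rejected by the eviction policy" is a three-state result. A purely illustrative sketch (the `SetResult` type and the toy size check are my assumptions):

```go
package main

import "fmt"

// SetResult distinguishes the three possible outcomes of SetIfAbsent,
// so a bare false is no longer ambiguous. Purely illustrative.
type SetResult int

const (
	Inserted SetResult = iota // the new value was stored
	Present                   // an existing value was kept and returned
	Rejected                  // the eviction policy refused the entry
)

// cache is a toy backing store with a hard size cap standing in for a
// real eviction policy.
type cache struct {
	data    map[string]int
	maxSize int
}

func (c *cache) SetIfAbsent(key string, value int) (int, SetResult) {
	if old, ok := c.data[key]; ok {
		return old, Present
	}
	if len(c.data) >= c.maxSize {
		return 0, Rejected // zero value; the caller can tell nothing was stored
	}
	c.data[key] = value
	return value, Inserted
}

func main() {
	c := &cache{data: map[string]int{}, maxSize: 1}
	v, r := c.SetIfAbsent("a", 1)
	fmt.Println(v, r == Inserted) // 1 true
	v, r = c.SetIfAbsent("a", 2)
	fmt.Println(v, r == Present) // 1 true
	_, r = c.SetIfAbsent("b", 3)
	fmt.Println(r == Rejected) // true
}
```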
I guess it has something to do with developer experience. Many times it is hard to prepare the loader function at initialization. Or the loader function may need some conditional dependencies that can't be determined at initialization.

---
Since the entry hangs around until GC'd anyway, it didn't seem too awful to me to handle the rejection within the eviction policy logic instead of at the map insertion. The add/update tasks shuffle an entry that is too large for immediate eviction, to avoid flushing the cache. Then it is handled however the entry was added (load, put, replace), it's linearizable, and it is simple to understand. I think the only immediate rejection that I do is for AsyncCache when inserting a completed and failed future value, but that's quite minor since it has the callback handler otherwise if it fails after insert.

---
@rueian Yes, it seems like a good solution.

This is true, but I was referring specifically to the requirement of specifying a loader. That is, instead of

```go
Load(ctx context.Context, key K, loader LoaderFunc[K, V]) (V, error)
```

it seems better to handle the loader optionally:

```go
Load(ctx context.Context, key K, loader ...LoaderFunc[K, V]) (V, error)
```

But it's very likely a special effect from ChatGPT. But something bothers me much more: what should we do if the loader was not passed either in the builder or explicitly in the function? It seems easiest to just return some kind of predefined error.

---
In the case of

```go
cache, err := otter.NewBuilder[int, int]().
	MaximumSize(1000).
	ExpireAfterWrite(time.Hour).
	CollectStats().
	Loader(func(ctx context.Context, key K) (V, error) {
		...
	}).
	BulkLoader(func(ctx context.Context, keys []K) (map[K]V, error) {
		...
	}).
	BuildLoading()
```
---
JCache is an alternative standardized Java API which I don't like, but it might be a helpful reference. They have a more powerful loader called

---
To be honest, it looks overloaded. Actually, I quite like the alternative with options, but I'm still not completely sure about this decision and I want to hear someone else's opinion. Of course, this approach has drawbacks, but in my opinion it looks quite good. The only thing that bothers me a little is the use of

**Option**

```go
type GetOption[K comparable, V any] func(*getOptions[K, V])

func WithRecordStats[K comparable, V any](recordStats bool) GetOption[K, V] {
	return func(o *getOptions[K, V]) {
		o.recordStats = recordStats
	}
}

func WithQuietly[K comparable, V any]() GetOption[K, V] {
	return func(o *getOptions[K, V]) {
		o.isQuietly = true
	}
}

func WithLoader[K comparable, V any](loader func(ctx context.Context, key K) (V, error)) GetOption[K, V] {
	return func(o *getOptions[K, V]) {
		o.loader = loader
	}
}

type SetOption[K comparable, V any] func(*setOptions[K, V])

func WithTTL[K comparable, V any](duration time.Duration) SetOption[K, V] {
	return func(so *setOptions[K, V]) {
		so.ttl = duration
	}
}
```

**Cache**

```go
type basicCache[K comparable, V any] interface {
	Set(key K, value V, opts ...SetOption[K, V])
	SetIfAbsent(key K, value V, opts ...SetOption[K, V]) (V, bool)
	Delete(key K)
	DeleteByFunc(f func(key K, value V) bool)
	Range(f func(key K, value V) bool)
	Clear()
	Size() int
	Stats() Stats
}

type Cache[K comparable, V any] interface {
	basicCache[K, V]
	Has(key K, opts ...GetOption[K, V]) bool
	Get(key K, opts ...GetOption[K, V]) (V, bool)
	GetEntry(key K, opts ...GetOption[K, V]) (Entry[K, V], bool)
	BulkGet(keys []K, opts ...GetOption[K, V]) (map[K]V, bool)
}

type LoadingCache[K comparable, V any] interface {
	basicCache[K, V]
	Get(ctx context.Context, key K, opts ...GetOption[K, V]) (V, error)
	GetEntry(ctx context.Context, key K, opts ...GetOption[K, V]) (Entry[K, V], error)
	BulkGet(ctx context.Context, keys []K, opts ...GetOption[K, V]) (map[K]V, error)
}
```

**Builder**

```go
type Builder[K comparable, V any] interface {
	MaximumSize(maximumSize int) Builder[K, V]
	MaximumWeight(maximumWeight int) Builder[K, V]
	CollectStats() Builder[K, V]
	InitialCapacity(initialCapacity int) Builder[K, V]
	Weigher(weigher func(key K, value V) uint32) Builder[K, V]
	DeletionListener(deletionListener func(key K, value V, cause DeletionCause)) Builder[K, V]
	ExpireAfterCreate(duration time.Duration) Builder[K, V]
	ExplicitExpireAfterCreate() Builder[K, V]
	ExpireAfterWrite(duration time.Duration) Builder[K, V]
	ExplicitExpireAfterWrite() Builder[K, V]
	ExpireAfterAccess(duration time.Duration) Builder[K, V]
	ExplicitExpireAfterAccess() Builder[K, V]
	Loader(loader func(ctx context.Context, key K) (V, error)) Builder[K, V]
	BulkLoader(bulkLoader func(ctx context.Context, keys []K) (map[K]V, error)) Builder[K, V]
	Build() (Cache[K, V], error)
	BuildLoading() (LoadingCache[K, V], error)
}
```

**Examples**

```go
// get quietly
value, ok := cache.Get(key, otter.WithQuietly[K, V]())

// explicit ttl
cache, err := otter.NewBuilder[int, int]().
	MaximumSize(1000).
	ExplicitExpireAfterWrite().
	CollectStats().
	Build()
...
cache.Set(key, value, otter.WithTTL[int, int](time.Hour))

// loader
loader := ...
cache.Get(ctx, key, otter.WithLoader(loader))
```
---
How about letting the loader return the TTL as well? I think, in many cases, the TTL can only be known when loading the entry.

```go
Loader(loader func(ctx context.Context, key K) (V, time.Duration, error)) Builder[K, V]
BulkLoader(bulkLoader func(ctx context.Context, keys []K) (map[K]WithTTL[V], error)) Builder[K, V]
```
---
Oh, I can hardly imagine cases in which the ttl is not known in advance. It seems that in such cases network requests are not needed, and such a function in the builder is enough:

```go
type Builder[K comparable, V any] interface {
	ExpireAfterCreateFunc(f func(key K, value V) time.Duration) Builder[K, V]
}
```

And when the ttl is returned from the loader, it is unlikely that the user will understand what needs to be returned as the ttl if it is not used.

---
Okay, it seems that such an API should be good enough. So I can start developing. :)

---
Was just reading this thread and saw that you want to go with the "Go options" style but don't want to pay the cost in allocations. You can allocate the options on the stack in basically the same manner and save the allocation. Might fit your use case, might not, but I threw up an example for you. Happy coding.

---
Haha, it really works. I tried to allocate on the stack and pass a pointer, but it leaked onto the heap and I didn't try to override the escape analysis decision somehow. Thank you very much! I have almost no free time for otter right now, but I hope there will be more of it later and I will release v2 anyway. :(

---
Trick I came up with a while ago. I was working on something where I spent so much time getting rid of allocations that allocating the options seemed a bad idea. And I get the time thing, I have a lot of that problem. Cheers!

---
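The linked example isn't visible here, but the trick presumably relies on passing the options struct by value: each option takes and returns the struct instead of mutating through a pointer, so nothing forces an escape to the heap. A sketch of that shape (all names illustrative):

```go
package main

import "fmt"

// getOptions is applied by value: no pointer ever leaves the frame,
// so the compiler is free to keep it on the stack.
type getOptions struct {
	quietly     bool
	recordStats bool
}

type GetOption func(getOptions) getOptions

func WithQuietly() GetOption {
	return func(o getOptions) getOptions { o.quietly = true; return o }
}

// applyOptions folds the options over the defaults, by value.
func applyOptions(opts []GetOption) getOptions {
	o := getOptions{recordStats: true} // defaults
	for _, opt := range opts {
		o = opt(o)
	}
	return o
}

func get(key string, opts ...GetOption) string {
	o := applyOptions(opts)
	if o.quietly {
		return "quiet:" + key
	}
	return "loud:" + key
}

func main() {
	fmt.Println(get("k", WithQuietly())) // quiet:k
	fmt.Println(get("k"))                // loud:k
}
```

Whether the variadic slice itself stays on the stack still depends on escape analysis and inlining, so it is worth confirming with `go build -gcflags=-m` on the real code.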
sturdyc might be a helpful reference. It has loading, request coalescing ("buffering"), fallback caching ("passthrough"), and negative caching ("non-existent"), but lacks good eviction/expiration. Their internal inFlight is a future and the missing record is an optional, which is how caffeine lets callers do these. Since that's not idiomatic in Go, their API does seem nice (as a non-Go developer).

---
Just random thoughts about new contracts...