Questions about trace files when running cachebench #306

rainjuns · 2024-04-15T05:58:43Z

Hello, thank you for managing the great project!

I found that CacheLib provides several traces, and I have two questions about testing them with CacheBench.

Do trace files include set operations caused by misses on get operations?

I found that CacheLib has an option (enableLookAside) that performs a set operation whenever a get operation misses.
However, I wonder whether this behavior is already captured in the trace files.

If a trace file contains many get operations that come before the set operation for the same key, is the miss ratio reported by CacheLib still accurate?

Depending on how the trace files were collected, only get operations may have been captured, and they will cause many misses and inflate the miss ratio.
In addition, if enableLookAside is turned on, many set operations will be generated for the same key-value pair.
In production, get operations for the same key might be queued while waiting for the response to the first miss.
Please refer to the following trace lines (from the first file of kvcache/202206), where key 1665497896 will generate a lot of misses:

key,op,size,op_count,key_size
1668757755,SET,82,1,40
1668757755,GET,0,1,40
1668757805,SET,208,1,63
1668757805,GET,0,1,63
1665498006,GET,104,2,64
1666258101,GET,81,2,23
1665497896,GET,169,18,78
1665702915,SET,109,1,40
1665702915,GET,0,1,40
1665497896,GET,169,18,78

For requests with the same key, what is the difference between (1) a trace line with op_count larger than 1 and (2) multiple trace lines with op_count=1?

therealgymmy · 2024-04-16T17:28:40Z

1). Yes, the traces include "SET" operations that were triggered by misses on get operations in our systems. There are some exceptions in the KV traces. Notably, some clients do "SET" first and then "GET" (after some minutes or hours). These clients are basically prefetching data. They're rare in the traces compared to the regular set-after-a-miss cache workloads.
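
To get a rough feel for how many keys look like the prefetch case described above, one could record the first operation seen for each key. The following is only a sketch: the function names are hypothetical, the column layout is taken from the sample in the question, and it assumes row order reflects arrival order, which the trace does not guarantee within a single second. A key flagged here is only a candidate, since the thread describes prefetching clients as issuing the GET minutes or hours later.

```python
import csv
import io

# Sample rows copied from the question (kvcache/202206 column layout).
SAMPLE = """key,op,size,op_count,key_size
1668757755,SET,82,1,40
1668757755,GET,0,1,40
1665498006,GET,104,2,64
1665497896,GET,169,18,78
1665702915,SET,109,1,40
1665702915,GET,0,1,40
1665497896,GET,169,18,78
"""


def first_op_per_key(trace_text):
    """Map each key to the first operation that appears for it in the trace."""
    first = {}
    for row in csv.DictReader(io.StringIO(trace_text)):
        # setdefault keeps the earliest op and ignores later rows for the key.
        first.setdefault(row["key"], row["op"])
    return first


def set_first_keys(trace_text):
    """Keys whose first row is a SET: candidates for prefetch-style clients."""
    first = first_op_per_key(trace_text)
    return sorted(key for key, op in first.items() if op == "SET")


if __name__ == "__main__":
    print(set_first_keys(SAMPLE))
```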

2). enableLookAside should only be used when you filter out all the "SET" operations from the original trace. This is useful when your cache size is drastically different from that of the cache configuration the trace was collected on, as it lets CacheBench behave like an actual cache instead of just replaying the original sets from the trace. (E.g. the original cache at a 90% hit rate would see far fewer sets than a smaller cache at 50% that receives the same GET workload.)
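
The effect described in 2) can be illustrated with a small simulation. This is a sketch under stated assumptions, not CacheBench's implementation: it drops the SET rows, replays only the GETs through a toy LRU cache whose capacity is counted in keys rather than bytes, and issues a set on every miss. The helper names `gets_only` and `replay_look_aside` are hypothetical.

```python
import csv
import io
from collections import OrderedDict

# Sample rows copied from the question (kvcache/202206 column layout).
TRACE = """key,op,size,op_count,key_size
1668757755,SET,82,1,40
1668757755,GET,0,1,40
1665498006,GET,104,2,64
1665497896,GET,169,18,78
1665702915,SET,109,1,40
1665702915,GET,0,1,40
1665497896,GET,169,18,78
"""


def gets_only(trace_text):
    """Drop SET rows so that sets come only from simulated misses."""
    rows = csv.DictReader(io.StringIO(trace_text))
    return [row for row in rows if row["op"] == "GET"]


def replay_look_aside(rows, capacity):
    """Replay GET rows through an LRU cache of `capacity` keys.

    Every miss is followed by a set of the same key (look-aside behavior).
    Returns (hits, misses, sets).
    """
    cache = OrderedDict()
    hits = misses = sets = 0
    for row in rows:
        key = row["key"]
        for _ in range(int(row["op_count"])):
            if key in cache:
                cache.move_to_end(key)  # mark as most recently used
                hits += 1
            else:
                misses += 1
                cache[key] = True  # the set triggered by the miss
                sets += 1
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return hits, misses, sets


if __name__ == "__main__":
    rows = gets_only(TRACE)
    print(replay_look_aside(rows, capacity=10))
    print(replay_look_aside(rows, capacity=1))
```

With the same GET workload, the smaller cache in this toy model issues more sets than the larger one, which is the behavior that replaying the original "SET" rows would not reproduce.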

3). op_count = the number of requests we saw for this key in this "second" when we originally collected the traces. Each row in our trace represents one second's worth of requests per key per operation.
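
Read this way, a row with op_count=18 stands for 18 requests for that key within one second. A minimal sketch of that interpretation follows; the function names are hypothetical, and how CacheBench itself handles op_count during replay is not stated in this thread. The ordering of requests inside a second is not recorded in the trace, so the expansion simply repeats the row.

```python
import csv
import io


def expand_row(row):
    """Turn one per-second trace row into op_count individual requests."""
    request = (row["key"], row["op"], int(row["size"]))
    return [request] * int(row["op_count"])


def expand_trace(trace_text):
    """Expand every row of a trace in the key,op,size,op_count,key_size layout."""
    requests = []
    for row in csv.DictReader(io.StringIO(trace_text)):
        requests.extend(expand_row(row))
    return requests


if __name__ == "__main__":
    sample = (
        "key,op,size,op_count,key_size\n"
        "1665497896,GET,169,18,78\n"
        "1665702915,SET,109,1,40\n"
    )
    requests = expand_trace(sample)
    print(len(requests), requests[0], requests[-1])
```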

rainjuns · 2024-04-17T07:04:30Z

@therealgymmy Thank you for the response. I appreciate it!

rainjuns changed the title ~~Questions about trace files~~ Questions about trace files when running cachebench Apr 15, 2024
