Binary request generator/replayer #307
Thanks for sending this out! A few comments inline. :-)
cachelib/cachebench/util/Config.h
uint32_t ampFactor{1};
uint32_t ampSizeFactor{1};
what is the difference between ampFactor and ampSizeFactor?
added comment - ampSizeFactor is ampFactor for item size
op = o;
ttlSecs = ttl;
requestId = reqId;
sizeBegin = std::vector<size_t>::iterator(valueSize);
Same here. What about sizeEnd?
fixed
uint64_t lines_ = 0;
};

class BinaryFileStream {
Move to a separate file
I think this fits really well under the trace file stream class, should both of those go to a new file?
This offers much lower overhead for trace replay. It assumes the kvcache trace format and kvcache behavior.

This patch supports the following:
- binary request generation and replay
- fast forwarding of a trace
- preloading requests into memory
- object size amplification
- queue free for even lower request overhead
- can parse many more requests per second than cachelib can process, so we can get 100% CPU usage

The limitations are:
- no trace amplification (however you can amplify the original .csv trace and save it in binary format)
- ~4GB overhead per 100 million requests
- you need some disk space to store large traces

Example usage:
/opt/cachelib/bin/binary_trace_gen --json_test_config opt/cachelib/test_configs/trace_replay/kvcache/config_kvtrace_gen_binary.json --progress 1
/opt/cachelib/bin/cachebench --json_test_config opt/cachelib/test_configs/trace_replay/kvcache/config_kvtrace_replay_binary.json --progress 1
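To make the fast-forwarding, size-amplification, and overhead points above concrete, here is a minimal sketch. It assumes a hypothetical fixed-size 40-byte record (`BinaryRecord`), chosen only because ~4GB per 100 million requests works out to roughly 40 bytes per request; the actual layout used by the patch may differ. The helper names `traceBytes`, `fastForward`, and `amplifiedSize` are illustrative, and treating ampSizeFactor as a plain multiplier on the object size is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size record. Field names and layout are assumptions
// made for illustration, not the format defined by the patch.
struct BinaryRecord {
  uint64_t keyId;      // identifier standing in for the key
  uint64_t requestId;  // sequence number of the request
  uint64_t valueSize;  // object size in bytes
  uint32_t ttlSecs;    // time to live
  uint32_t op;         // operation code (get/set/...)
  uint64_t reserved;   // padding so that a record is 40 bytes
};
static_assert(sizeof(BinaryRecord) == 40, "sketch assumes 40-byte records");

// Space needed for a trace: with 40-byte records, 100 million requests
// take about 4 GB.
constexpr uint64_t traceBytes(uint64_t numRequests) {
  return numRequests * sizeof(BinaryRecord);
}

// Fast forwarding over fixed-size records is index arithmetic: nothing is
// parsed. The skip count is clamped to the end of the trace.
inline const BinaryRecord* fastForward(const BinaryRecord* begin,
                                       size_t count,
                                       size_t skip) {
  assert(begin != nullptr);
  return begin + (skip < count ? skip : count);
}

// Object size amplification, assumed here to be a simple multiplication
// applied at replay time, leaving the stored record untouched.
inline uint64_t amplifiedSize(const BinaryRecord& r, uint32_t ampSizeFactor) {
  return r.valueSize * ampSizeFactor;
}
```

Because every record has the same size, skipping the first N requests costs the same regardless of N, which is what makes fast forwarding cheap compared with re-parsing a CSV trace line by line.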
73b3690 to f39eb9b
This is the binary trace replayer/generator that we have been using to achieve max CPU utilization for the kvcache traces in cachebench. With this generator, we can achieve a throughput of over 20 million op/sec using the kvcache workload in cachebench. For comparison, the CSV replay generator reaches only ~1.6 million op/sec due to dynamic allocations and parsing overhead.
We avoid allocations by mmap'ing the request data into memory and using a Request pointer to point to the request data rather than allocating a new request wrapper for each request.
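The idea of pointing directly into mapped request data can be sketched as follows. This is not the code from the patch: `PackedRequest`, `writeTrace`, and `MappedTrace` are hypothetical names, the 16-byte record layout is an assumption, and the sketch relies on POSIX `open`/`fstat`/`mmap`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <stdexcept>
#include <string>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical fixed-size record (16 bytes); the real format may differ.
struct PackedRequest {
  uint64_t keyId;
  uint32_t valueSize;
  uint32_t ttlSecs;
};

// Generation side: dump the records to disk as raw bytes.
inline bool writeTrace(const std::string& path,
                       const PackedRequest* reqs,
                       size_t n) {
  std::FILE* f = std::fopen(path.c_str(), "wb");
  if (!f) return false;
  const size_t written = std::fwrite(reqs, sizeof(PackedRequest), n, f);
  return std::fclose(f) == 0 && written == n;
}

// Replay side: map the file read-only and hand out pointers into the
// mapping, so no per-request allocation or parsing takes place.
class MappedTrace {
 public:
  explicit MappedTrace(const std::string& path) {
    const int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) throw std::runtime_error("open failed: " + path);
    struct stat st {};
    if (::fstat(fd, &st) != 0 || st.st_size <= 0 ||
        static_cast<size_t>(st.st_size) % sizeof(PackedRequest) != 0) {
      ::close(fd);
      throw std::runtime_error("unexpected trace size: " + path);
    }
    bytes_ = static_cast<size_t>(st.st_size);
    void* p = ::mmap(nullptr, bytes_, PROT_READ, MAP_PRIVATE, fd, 0);
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
    base_ = static_cast<const PackedRequest*>(p);
  }
  ~MappedTrace() { ::munmap(const_cast<PackedRequest*>(base_), bytes_); }
  MappedTrace(const MappedTrace&) = delete;
  MappedTrace& operator=(const MappedTrace&) = delete;

  size_t size() const { return bytes_ / sizeof(PackedRequest); }

  // Returns a pointer to the next record, or nullptr at the end of the trace.
  const PackedRequest* next() {
    assert(base_ != nullptr);
    return pos_ < size() ? base_ + pos_++ : nullptr;
  }

 private:
  const PackedRequest* base_ = nullptr;
  size_t bytes_ = 0;
  size_t pos_ = 0;
};
```

A replay loop would then call `next()` until it returns `nullptr`; each call is a bounds check and a pointer increment, in contrast to the per-line parsing and allocation of a CSV replay.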
To generate a binary request file from an existing kvcache trace (using the "replay" generator):
- Set the traceFileNames or traceFileName option. Specify other properties such as ampFactor too.
- Set binaryFileName: "mybinaryfile.bin" as a config option

To run a binary request trace specify the following:
- traceFileName: "mybinaryfile.bin"
- and set ampSizeFactor (if desired)

In summary, this patch offers much lower overhead for trace replay. It does assume the kvcache trace format and kvcache replay generator behavior. Additional features:

The limitations are:
- no trace amplification (however you can amplify the original .csv trace and save it in binary format)