Binary request generator/replayer #307

Open
wants to merge 1 commit into
base: main

@byrnedj commented Apr 22, 2024

This is the binary trace replayer/generator that we have been using to achieve maximum CPU utilization for the kvcache traces in cachebench. With this generator, we can achieve a throughput of over 20 million ops/sec using the kvcache workload in cachebench. For comparison, the CSV replay generator reaches only ~1.6 million ops/sec because of dynamic allocations and parsing overhead.

We avoid allocations by mmap'ing the request data into memory and pointing a Request pointer at that data, rather than allocating a new request wrapper for each request.
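
As a rough illustration of the mmap approach described above, here is a minimal sketch, not the code from this PR. The `BinaryRequest` layout, `writeTrace`, and `MappedTrace` are hypothetical names invented for the example; the actual record format and class names in the patch may differ. It assumes a POSIX system and a file made of fixed-size records.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical fixed-size record; the PR's actual on-disk layout may differ.
struct BinaryRequest {
  uint64_t key;        // numeric key id (assumption)
  uint32_t valueSize;  // object size in bytes
  uint32_t ttlSecs;    // time to live in seconds
  uint8_t op;          // operation code, e.g. get or set
  uint8_t pad[7];      // explicit padding so every record is 24 bytes
};
static_assert(sizeof(BinaryRequest) == 24, "record must have a fixed size");

// Write n records to path as raw bytes; returns true on success.
inline bool writeTrace(const char* path, const BinaryRequest* reqs, size_t n) {
  FILE* f = std::fopen(path, "wb");
  if (!f) return false;
  const size_t written = std::fwrite(reqs, sizeof(BinaryRequest), n, f);
  const bool closed = std::fclose(f) == 0;
  return written == n && closed;
}

// Read-only view over an mmap'd trace file.
class MappedTrace {
 public:
  explicit MappedTrace(const char* path) {
    fd_ = ::open(path, O_RDONLY);
    if (fd_ < 0) return;
    struct stat st;
    if (::fstat(fd_, &st) != 0 || st.st_size <= 0) return;
    void* p = ::mmap(nullptr, static_cast<size_t>(st.st_size), PROT_READ,
                     MAP_PRIVATE, fd_, 0);
    if (p == MAP_FAILED) return;
    base_ = static_cast<const BinaryRequest*>(p);
    bytes_ = static_cast<size_t>(st.st_size);
  }
  ~MappedTrace() {
    if (base_) ::munmap(const_cast<BinaryRequest*>(base_), bytes_);
    if (fd_ >= 0) ::close(fd_);
  }
  MappedTrace(const MappedTrace&) = delete;
  MappedTrace& operator=(const MappedTrace&) = delete;

  bool valid() const { return base_ != nullptr; }
  size_t size() const { return bytes_ / sizeof(BinaryRequest); }
  // Returns a pointer into the mapping: no copy and no per-request allocation.
  const BinaryRequest* at(size_t i) const { return base_ + i; }

 private:
  int fd_ = -1;
  const BinaryRequest* base_ = nullptr;
  size_t bytes_ = 0;
};
```

With fixed-size records, a replay loop only advances an index and dereferences `at(i)`, and fast forwarding amounts to starting from a larger index. Whether the patch implements fast forwarding this way is an assumption.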

To generate a binary request file from an existing kvcache trace (using the "replay" generator):

  1. Specify the kvcache trace name using the regular traceFileNames or traceFileName option. Specify other properties such as ampFactor too.
  2. In the replayGeneratorConfig, specify binaryFileName: "mybinaryfile.bin" as a config option
  3. Run cachebench and wait for the binary file to be generated

To run a binary request trace, specify the following:

  1. Set generator to "binary-replay"
  2. Set traceFileName: "mybinaryfile.bin" and set ampSizeFactor (if desired)

In summary, this patch offers much lower trace replay overhead. It assumes the kvcache trace format and the kvcache replay generator behavior. Additional features:

  • fast forwarding of a trace
  • preloading requests into memory
  • object size amplification
  • queue free for even lower request overhead

The limitations are:

  • no trace amplification (however you can amplify the
    original .csv trace and save it in binary format)
  • ~4GB overhead per 100 million requests
  • you need some disk space to store large traces
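
To put the "~4GB per 100 million requests" figure in perspective, the arithmetic can be sketched as below. This assumes "GB" means 10^9 bytes, which the description does not state; `estimateTraceBytes` is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: "~4GB per 100 million requests" read as 4e9 bytes / 1e8 requests.
constexpr uint64_t kBytesPerUnit = 4000000000ULL;
constexpr uint64_t kRequestsPerUnit = 100000000ULL;
constexpr uint64_t kBytesPerRequest = kBytesPerUnit / kRequestsPerUnit;
static_assert(kBytesPerRequest == 40, "roughly 40 bytes per request");

// Hypothetical helper: approximate footprint for a trace of numRequests.
constexpr uint64_t estimateTraceBytes(uint64_t numRequests) {
  return numRequests * kBytesPerRequest;
}
```

Under that reading, each request costs about 40 bytes, so a trace of one billion requests would need roughly 40 GB.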

@facebook-github-bot added the CLA Signed label Apr 22, 2024
@therealgymmy left a comment

Thanks for sending this out! A few comments inline. :-)

cachelib/cachebench/cache/Cache.h (outdated, resolved)
cachelib/cachebench/runner/CacheStressor.h (resolved)
Comment on lines 127 to 132
uint32_t ampFactor{1};
uint32_t ampSizeFactor{1};
Contributor

what is the difference between ampFactor and ampSizeFactor?

Contributor Author

added comment - ampSizeFactor is ampFactor for item size
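
A minimal sketch of the distinction, assuming ampSizeFactor is applied as a plain multiplier on each object's size while ampFactor amplifies the trace itself. `amplifiedSize` is a hypothetical helper; how the patch actually applies the factor is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: scale one object's size by ampSizeFactor.
// Assumption: the factor is a simple multiplier; a factor of 1 leaves sizes unchanged.
inline size_t amplifiedSize(size_t valueSize, uint32_t ampSizeFactor) {
  return valueSize * static_cast<size_t>(ampSizeFactor);
}
```

Under this assumption, the number of requests replayed stays the same and only the bytes stored per object grow.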

cachelib/cachebench/util/Config.h (resolved)
cachelib/cachebench/util/Request.h (outdated, resolved)
op = o;
ttlSecs = ttl;
requestId = reqId;
sizeBegin = std::vector<size_t>::iterator(valueSize);
Contributor

same here. what about sizeEnd?

Contributor Author

fixed

cachelib/cachebench/workload/BinaryKVReplayGenerator.h (outdated, resolved)
uint64_t lines_ = 0;
};

class BinaryFileStream {
Contributor

Move to a separate file

Contributor Author

I think this fits really well under the trace file stream class. Should both of those go to a new file?

------------------------
This offers much lower overhead of trace replaying. It
assumes the kvcache trace format and kvcache behavior.
This patch supports the following:
- binary request generation and replay
- fast forwarding of a trace
- preloading requests into memory
- object size amplification
- queue free for even lower request overhead
- can parse many more requests per second than cachelib
  can process, so we can get 100% CPU usage

The limitations are:
- no trace amplification (however you can amplify the
original .csv trace and save it in binary format)
- ~4GB overhead per 100 million requests
- you need some disk space to store large traces

Example usage:

/opt/cachelib/bin/binary_trace_gen --json_test_config opt/cachelib/test_configs/trace_replay/kvcache/config_kvtrace_gen_binary.json --progress 1
/opt/cachelib/bin/cachebench --json_test_config opt/cachelib/test_configs/trace_replay/kvcache/config_kvtrace_replay_binary.json --progress 1