New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: Save symbolic data from *.ktest files as a flat binary file #1681
base: master
Are you sure you want to change the base?
Conversation
Do you think this feature might be useful? |
91e2b67
to
1a09bbe
Compare
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #1681 +/- ##
=======================================
Coverage 70.18% 70.19%
=======================================
Files 162 162
Lines 19397 19407 +10
Branches 4632 4635 +3
=======================================
+ Hits 13614 13622 +8
- Misses 3743 3744 +1
- Partials 2040 2041 +1
|
@tkuchta thanks for the contribution and sorry for the delayed answer. We are prioritising PRs for an upcoming KLEE release. The feature is self-contained enough, so we could add it -- do others have a different opinion? |
I think the approach is a bit odd since it dumps all objects into a single file in arbitrary order (e.g. even stat structs, symbolic arguments, ...). In my opinion this should either be a feature of |
@251 @ccadar thank you for your comments! My intention was to dump the concrete values of symbolic objects in the order of object creation, e.g.
that would create a file with the following contents:
The reason for doing that is when we use a fuzzer and substitute I can see two advantages of doing that operation in KLEE:
Thank you again for the comments! |
I have to admit that to me it seems as if this approach is too much to take in "because, why not", but not enough to properly integrate with a fuzzer. |
Hi @danielschemmel , thank you for the comment. Could you possibly let me know what would be missing for the fuzzer integration, please? |
For one, I think that for fuzzer integration it is necessary to specify the output folder (e.g., AFL's queue), instead of just using Secondly, I think there are a few important special cases to consider. The most important one is stdin:
Should ideally yield a buffer file that contains exactly
This means, that there is a non-obvious issue with this example:
|
Thank you for the comment @danielschemmel Indeed, I only considered the file input. I was wondering if stdin would be meaningful in case of the fuzzer; in other words if this is possible to feed AFL with more inputs externally when it assumes getting the data from stdin rather than from a file? Would it also take the data from the queue directory and treat it as stdin? I agree we could add the output files destination and perhaps default it to klee-last. |
Hi @tkuchta,
Symbolic files have the same issue (that's what I meant with "stat structs" above):
Also, the order of objects can be dependent on the external environment and can create non-determinism when "replayed" (or executed via AFL). Of course, for fuzzing this might be less important. I'd also prefer more sophisticated support for fuzzers or would just use a ktest-tool hack + watchexec (or similar) in your case. |
Hi @251,
|
One design that I could imagine here is to somehow™ mark the structure belonging to stdin and symbolic files, and to put them into their own binary files. That would nicely circumvent any non-determinism issues: The variables file contains all custom variables in the order in which they were created (by AFL (and other fuzzers) usually provide a stream of symbolic data on stdin or as a file (path). This most closely maps to the symbolic stdin/symbolic files mechanism in KLEE, which means that we should support it like this for a general fuzzer integration. |
When you rely on the environment:
|
@danielschemmel , @251 Thank you for the comments! |
Summary:
This PR implements a new feature enabled by the
--write-bin-tests
flag.When the flag is used, for every
*.ktest
file, KLEE stores a corresponding*.ktest.buffer
file which contains only the data of the symbolic variables stored as a flat binary file.The use case for these binary files is the following: imagine we want to use KLEE in a hybrid mode with a fuzzer on the same test harness. Currently, if we want to extract data from KLEE and use it with fuzzer, we need to parse
*.ktest
files which contain metadata in addition to the pure symbolic variable contents.The new feature allows to take the
*.ktest.buffer
files and feed them directly to a fuzzer (e.g. AFL++).Checklist: