Protobuf format in input files is not user friendly #847

Open
hadi88 opened this issue Sep 6, 2023 · 6 comments

@hadi88
Contributor

hadi88 commented Sep 6, 2023

Protobufs are serialized and parsed using the Protobuf binary format. This makes it hard for users to look through input files. For example, it's almost impossible to read a crash-producing input to understand the crash.

With small changes to mutation/mutator/proto/BuilderMutatorFactory.java, the write and read methods could use the Protobuf TextFormat utility so that input files contain protos in human-readable text format.

@fmeum
Contributor

fmeum commented Sep 6, 2023

We currently decode the byte array provided by libFuzzer in every execution, so I am a bit worried that switching out the highly optimized binary format for the fully reflection-based text format will regress fuzz test performance. Have you tested this on a representative fuzz test?

The general approach we have been following to make sense of seeds is to rely on the JUnit integration, which allows developers to inspect the fuzz test parameters as proper Java objects simply by running the fuzz test in test mode and setting a break point. We have found that inspecting input files directly creates friction for developers, even if the format is relatively straightforward. But I can see how that would be different in environments where Protobuf is used heavily and the fuzz test accepts only a single Proto parameter.

@hadi88
Contributor Author

hadi88 commented Sep 6, 2023

I also haven't tested the performance and I didn't know that the encoding occurs in every iteration. This certainly wouldn't be great.

Thinking out loud... the encoding during fuzzing could be different from the encoding for user-facing objects. But I understand that this may require a major change in the code structure.

@fmeum
Contributor

fmeum commented Sep 6, 2023

> I also haven't tested the performance and I didn't know that the encoding occurs in every iteration. This certainly wouldn't be great.

We are looking into reusing the in-memory objects if the input bytes haven't been loaded from disk. This does require patching libFuzzer though.

If you can run a simple performance test on a real-world fuzz test, that could provide us with very relevant data.
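
A rough way to isolate the decoding cost before trying a full fuzz run could look like the sketch below. It is not Jazzer code: `DecodeBenchSketch` and `nanosPerDecode` are hypothetical names, and the two decoders in `main` are stand-ins that parse a single integer from a binary and a text encoding. For a real measurement the lambdas would wrap the binary and text parsers of the actual message type, and the executions per second reported by the fuzzer on a real fuzz test would remain the more relevant number.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/** Sketch only: times repeated decoding of one input. Not part of Jazzer. */
final class DecodeBenchSketch {
  /** Returns the average number of nanoseconds one call to decoder takes on input. */
  static double nanosPerDecode(Function<byte[], Object> decoder, byte[] input, int iterations) {
    Object sink = null;
    // Warm-up pass so the JIT has a chance to compile the decoder before timing.
    for (int i = 0; i < iterations; i++) {
      sink = decoder.apply(input);
    }
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      sink = decoder.apply(input);
    }
    long elapsed = System.nanoTime() - start;
    // Use the result so the decoding work is not trivially dead code.
    if (sink == null) {
      throw new IllegalStateException("decoder returned null");
    }
    return (double) elapsed / iterations;
  }

  public static void main(String[] args) {
    // Stand-in inputs: the same integer in a binary and in a text encoding.
    byte[] binary = ByteBuffer.allocate(4).putInt(123456).array();
    byte[] text = "123456".getBytes(StandardCharsets.UTF_8);
    int iterations = 1_000_000;

    double binaryNs = nanosPerDecode(b -> ByteBuffer.wrap(b).getInt(), binary, iterations);
    double textNs =
        nanosPerDecode(
            b -> Integer.parseInt(new String(b, StandardCharsets.UTF_8)), text, iterations);
    System.out.printf("binary: %.1f ns/decode, text: %.1f ns/decode%n", binaryNs, textNs);
  }
}
```

A hand-rolled loop like this is sensitive to JIT and GC noise, so a benchmark harness such as JMH would give more trustworthy numbers.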

> Thinking out loud... the encoding during fuzzing could be different from the encoding for user-facing objects. But I understand that this may require a major change in the code structure.

I fully agree. Again this would be possible by patching libFuzzer and is certainly something we could consider. It's just that so far we found it more effective to improve the Java debugging experience, which somewhat sidesteps the question of what a human-readable input file should look like.

@hadi88
Contributor Author

hadi88 commented Sep 19, 2023

It's a little hard to judge the performance requirements for all fuzz targets. libprotobuf_mutator gives users the option to mutate binary protos or text protos, and the default is to mutate text protos:

https://github.com/google/libprotobuf-mutator/blob/master/src/libfuzzer/libfuzzer_macro.h#L26-L35

I'm not sure about the default option, but would it be possible for Jazzer to have both options?
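
To make the "both options" idea concrete, here is a minimal sketch of a read/write pair with a selectable on-disk format. Everything in it is an assumption for illustration: `CorpusFormatSketch`, `Point`, and `Format` are hypothetical names, `Point` is a stand-in for a generated Protobuf message, and the hand-written encodings only mimic the binary and text formats. In Jazzer the two branches would presumably delegate to the binary Protobuf format and to TextFormat instead.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Sketch only: a reader/writer pair with a selectable input format. Not Jazzer code. */
final class CorpusFormatSketch {
  /** Hypothetical stand-in for a generated Protobuf message with two int fields. */
  static final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
      this.x = x;
      this.y = y;
    }
  }

  /** Hypothetical option, analogous to choosing between binary and text protos. */
  enum Format {
    BINARY,
    TEXT
  }

  static byte[] write(Point p, Format format) {
    if (format == Format.BINARY) {
      // Compact fixed-width encoding: cheap to decode on every execution.
      return ByteBuffer.allocate(8).putInt(p.x).putInt(p.y).array();
    }
    // Human-readable encoding: easy to inspect in a crash file.
    return ("x: " + p.x + "\ny: " + p.y + "\n").getBytes(StandardCharsets.UTF_8);
  }

  // Note: a real reader must tolerate arbitrary fuzzer-provided bytes; this sketch
  // throws on truncated binary input or non-numeric text values.
  static Point read(byte[] data, Format format) {
    if (format == Format.BINARY) {
      ByteBuffer buf = ByteBuffer.wrap(data);
      return new Point(buf.getInt(), buf.getInt());
    }
    int x = 0;
    int y = 0;
    for (String line : new String(data, StandardCharsets.UTF_8).split("\n")) {
      String[] kv = line.split(":", 2);
      if (kv.length != 2) {
        continue; // skip blank or malformed lines
      }
      int value = Integer.parseInt(kv[1].trim());
      String key = kv[0].trim();
      if (key.equals("x")) {
        x = value;
      } else if (key.equals("y")) {
        y = value;
      }
    }
    return new Point(x, y);
  }

  public static void main(String[] args) {
    Point p = new Point(3, -7);
    // The text form is what a user would see when opening an input file.
    System.out.print(new String(write(p, Format.TEXT), StandardCharsets.UTF_8));
    // The binary form round-trips to the same values.
    Point back = read(write(p, Format.BINARY), Format.BINARY);
    System.out.println(back.x + "," + back.y);
  }
}
```

Keeping the format behind a single switch means the fuzzing loop could stay on the binary path while only user-facing files use the text path, at the cost of maintaining two code paths.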

@fmeum
Contributor

fmeum commented Oct 17, 2023

We will look into this and other ways to make the corpus entries easier to handle eventually. We are currently focusing on polishing the JUnit 5 based workflow though, so I can't say yet when we will get to this.

@ghost

ghost commented Feb 26, 2024

Hi @hadi88! Did you ever get your issue with Jazzer resolved? We just need to understand in detail what you are trying to achieve so that we can suggest the best options.
Ping me? david[dot]merian [at] code-intelligence[dot]com
