Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ProbabilisticGeneratorGrammarCoverageFuzzer really slow for CSV grammar (>30 times slower than GeneratorGrammarFuzzer) #97

Open
rindPHI opened this issue Mar 11, 2021 · 0 comments

Comments

@rindPHI
Copy link
Collaborator

rindPHI commented Mar 11, 2021

Describe the bug
The ProbabilisticGeneratorGrammarCoverageFuzzer is over 30 times slower than GeneratorGrammarFuzzer when fuzzing 100 inputs from a CSV grammar (without generators or probabilities).

I do not know whether this is just part of the inherent, and unavoidable, complexity of coverage-based fuzzing, but it is really a significant slow-down compared to the standard grammar-based fuzzer. Maybe there is an avoidable bottleneck somewhere, or recent optimizations that can be integrated, and thus also presented to the readers of the fuzzing book?

To Reproduce
Execute the following code. It will demonstrate the relative performance loss.

from time import time

from fuzzingbook.GeneratorGrammarFuzzer import ProbabilisticGeneratorGrammarCoverageFuzzer
from fuzzingbook.GeneratorGrammarFuzzer import GeneratorGrammarFuzzer

grammar = {
    '<start>': ["<csv-record'>"],
    '<csv-record>': ['<csv-string-list>\n'],
    '<csv-string-list>': ['<raw-string>', '<raw-string>;<csv-string-list>'],
    '<raw-string>': ['<spaces>', '<spaces-1><raw-field><spaces-2>'],
    '<raw-field>': ['<simple-field>', '<quoted-field>'], '<simple-field>': ['<simple-character-1>'],
    '<simple-character>': ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e',
                           'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
                           'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I',
                           'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
                           'Y', 'Z', '!', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.',
                           '/', ':', '<', '=', '>', '?', '@', '[', '\\', ']', '^', '_', '`', '{', '|',
                           '}', '~', '\x0b', '\x0c'],
    '<quoted-field>': ['"<escaped-field>"'],
    '<escaped-field>': ['<escaped-character-1>'],
    '<escaped-character>': ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e',
                            'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
                            'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I',
                            'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
                            'Y', 'Z', '!', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.',
                            '/', ':', ';', '<', '=', '>', '?', '@', '[', '\\', ']', '^', '_', '`', '{',
                            '|', '}', '~', ' ', '\t', '\n', '\r', '\x0b', '\x0c'],
    '<spaces>': [' ', ' <spaces>'],
    '<spaces-1>': ['', '<spaces>'],
    '<spaces-2>': ['', '<spaces>'],
    '<simple-character-1>': ['', '<simple-character><simple-character-1>'],
    '<escaped-character-1>': ['', '<escaped-character><escaped-character-1>'],
    "<csv-record'>": ['<csv-record>']}


def fuzz(fuzzer):
    start_time = time()
    for _ in range(100):
        fuzzer.fuzz()
    return time() - start_time


fuzzer = ProbabilisticGeneratorGrammarCoverageFuzzer(grammar)
duration1 = fuzz(fuzzer)
print(f"Duration PGGC-Fuzzer: {round(duration1, 2)}s")

fuzzer = GeneratorGrammarFuzzer(grammar)
duration2 = fuzz(fuzzer)
print(f"Duration GeneratorGrammarFuzzer: {round(duration2, 2)}s")

print(f"Ratio: GeneratorGrammarFuzzer takes {round((duration2 / duration1) * 100, 1)}% of the time of PGGC-Fuzzer")
print(f"Ratio: PGGC-Fuzzer {round((duration1 / duration2), 1)} times slower than GeneratorGrammarFuzzer")

Example Output

Duration PGGC-Fuzzer: 6.4s
Duration GeneratorGrammarFuzzer: 0.19s
Ratio: GeneratorGrammarFuzzer takes 3.0% of the time of PGGC-Fuzzer
Ratio: PGGC-Fuzzer 33.9 times slower than GeneratorGrammarFuzzer

Expected behavior
I expect the ProbabilisticGeneratorGrammarCoverageFuzzer to be slower than GeneratorGrammarFuzzer, but not 40 times.

Desktop (please complete the following information):

  • OS: macOS
  • Python version: 3.9
@rindPHI rindPHI changed the title ProbabilisticGeneratorGrammarCoverageFuzzer really slow for CSV grammar (~40 times slower than GrammarFuzzer) ProbabilisticGeneratorGrammarCoverageFuzzer really slow for CSV grammar (~40 times slower than GeneratorGrammarFuzzer) Mar 12, 2021
@rindPHI rindPHI changed the title ProbabilisticGeneratorGrammarCoverageFuzzer really slow for CSV grammar (~40 times slower than GeneratorGrammarFuzzer) ProbabilisticGeneratorGrammarCoverageFuzzer really slow for CSV grammar (>30 times slower than GeneratorGrammarFuzzer) Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant