Hypothesis produces poor results #20

Open
TheShiftedBit opened this issue Jul 20, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

@TheShiftedBit
Contributor

TheShiftedBit commented Jul 20, 2021

I'm writing end-to-end tests to make sure Python coverage is high-quality. In doing so, I discovered that Hypothesis structured fuzzing gives really poor fuzz quality. Even the example in the README doesn't work:

import atheris
import sys
from hypothesis import given, strategies as st

@given(st.from_regex(r"\w+!?", fullmatch=True))
@atheris.instrument_func
def test(string):
  assert string != "bad"

atheris.Setup(sys.argv, atheris.instrument_func(test.hypothesis.fuzz_one_input))
atheris.Fuzz()

I checked, and this isn't caused by the new coverage method; it works poorly with the old coverage too. Running the same check with regular Atheris (no Hypothesis), however, works excellently.

@Zac-HD, as the original contributor of the Hypothesis examples: do you have any suggestions here? I was thinking something along the lines of an external mutator for libFuzzer might work to fix the issues here. That's how libprotobuf-mutator for C++ works.
@nedwill your input might also be helpful here.

@nedwill

nedwill commented Jul 20, 2021

How do mutations work with Hypothesis? I assumed it only did generation, not mutation of existing test cases. This example does show one of the challenges with a Hypothesis strategy as expressive as arbitrary (?) regex. I haven't seen it done before, but intuitively a mutator for regex might involve matching parts of the seed string to different states in the regex FSM, then adding, replacing, or editing parts of the input without producing a string that the regex would reject. There may already be logic in Hypothesis to do this, unless it randomly generates strings and checks whether they match, so it may not be too bad.

IMHO, I would just write a mutator for the subset of strategies for which inputs are a simple tree structure and warn the user not to use regex for fuzz testing.
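
One way to picture the structure-preserving mutation described above, for the `\w+!?` pattern from the original example: split the seed into its word body and optional trailing `!`, apply one small edit, and reassemble. This is a hand-written sketch for that single regex, not a general regex-FSM mutator and not how Hypothesis works internally. `mutate_match` and `WORD_CHARS` are hypothetical names.

```python
import random
import re
import string

PATTERN = re.compile(r"\w+!?")
# Assumption: replacement characters are drawn from ASCII word characters only.
WORD_CHARS = string.ascii_letters + string.digits + "_"


def mutate_match(seed_text: str, rng: random.Random) -> str:
  # Returns a mutated string that still fully matches PATTERN.
  if PATTERN.fullmatch(seed_text) is None:
    seed_text = "a"  # fall back to a minimal valid seed

  # Map the seed onto the two "states" of the pattern: body and optional bang.
  has_bang = seed_text.endswith("!")
  body = list(seed_text[:-1] if has_bang else seed_text)

  op = rng.choice(("replace", "insert", "delete", "toggle_bang"))
  if op == "replace":
    body[rng.randrange(len(body))] = rng.choice(WORD_CHARS)
  elif op == "insert":
    body.insert(rng.randrange(len(body) + 1), rng.choice(WORD_CHARS))
  elif op == "delete" and len(body) > 1:
    # Never delete the last character: the body must stay non-empty.
    del body[rng.randrange(len(body))]
  elif op == "toggle_bang":
    has_bang = not has_bang

  return "".join(body) + ("!" if has_bang else "")
```

Every edit keeps the body non-empty and made of word characters, so the result is accepted by the regex by construction rather than by generate-and-check.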

@TheShiftedBit
Contributor Author

A note: I'm removing references to Hypothesis from the repo, at least for now. The fuzz quality is really, really bad.

Now that we have more control of the coverage system, I actually plan to revive #5, which might make regexes work way better.

@rmonat

rmonat commented Mar 28, 2022

Just to be sure: does this issue apply to all generators created by Hypothesis, or only to the regex-specific one?

I tried the code below and seem to have the same performance issues.

import atheris, sys

from hypothesis import given, strategies as st

@given(st.text())
@atheris.instrument_func
def test(string):
  assert string != "bad"

atheris.Setup(sys.argv, atheris.instrument_func(test.hypothesis.fuzz_one_input))
atheris.Fuzz()

@TheShiftedBit
Contributor Author

All generators created by Hypothesis.

Atheris now supports custom mutators, so that might be a better solution.

DavidKorczynski pushed a commit to google/oss-fuzz that referenced this issue Feb 21, 2023
This is an improvement to the urllib3 utils functions fuzzing. This is
my first attempt at contributing to oss-fuzz so feedback appreciated on
whether this custom mutator approach is ok.

- `parse_url` function
We now have a custom mutator that drives more URL-like structures. I
originally developed this using Hypothesis, but the quality of fuzzing
wasn't good (this seems to be a known issue:
google/atheris#20). Because of this I've used
a third-party library, `exrex`, that can generate test data from regular
expressions.
eamonnmcmanus pushed a commit to eamonnmcmanus/oss-fuzz that referenced this issue Mar 15, 2023
@jvoisin jvoisin added the enhancement New feature or request label Mar 20, 2023