
I am doing something roughly like this:

test_a.py

import unittest

import hypothesis
import hypothesis.extra.numpy
import numpy as np
from hypothesis import strategies as st

SHAPE = (10, )

ARRAY_STRATEGY = hypothesis.extra.numpy.arrays(float, SHAPE, elements=st.floats(min_value=-1, max_value=1))
ZERO_ONE_STRATEGY = st.floats(min_value=0, max_value=1)


class TestMyClass(unittest.TestCase):
    @hypothesis.settings(max_examples=10)
    @hypothesis.given(
        value=ZERO_ONE_STRATEGY,
        arr1=ARRAY_STRATEGY,
        arr2=ARRAY_STRATEGY,
        arr3=ARRAY_STRATEGY,
    )
    def test_something(self, value: float, arr1: np.ndarray, arr2: np.ndarray, arr3: np.ndarray) -> None:
        print(value)

And I am often seeing behavior like this:

$ pytest test_a.py --capture=no
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0

I am testing behavior with respect to value, so sampling 0.0 ten times renders this test useless to me. I don't need to randomly sample value; in fact, I would like to sample it evenly over the range of ZERO_ONE_STRATEGY at a granularity of 1 / (max_examples - 1). How can I create a strategy to do that?
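For concreteness, this is the kind of evenly spaced grid I have in mind (a sketch, assuming max_examples=10, so a step of 1/9; note that st.sampled_from draws from the list but does not promise that each point appears exactly once):

import numpy as np
from hypothesis import strategies as st

# Ten evenly spaced points over [0, 1], step 1 / (max_examples - 1) = 1/9
GRID = np.linspace(0.0, 1.0, num=10).tolist()
GRID_STRATEGY = st.sampled_from(GRID)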

ringo

1 Answer


Hypothesis does not offer control over the distribution of examples, because computers are better at this than humans.

  • In your example, I'd bet that the other arguments are in fact varying between examples, and it's pretty reasonable to check what happens when value=0.0 and the arrays take on a variety of values.
  • The distribution of inputs which finds bugs fastest-in-expectation is certainly not uniform; nor does it match the distribution of inputs seen in production.
  • We make no promises about the distribution of examples, so that we can improve it in new versions and based on runtime observations of the behavior of the code under test. Coverage-guided fuzzing with HypoFuzz or SMT solving with CrossHair are extreme cases, but Hypothesis itself has some feedback mechanisms too, as well as the target() function (see the sketch after this list).
  • We deliberately keep a small amount of redundancy, weighted towards the early examples, to check that your code does the same thing when fed the same inputs.
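For instance, target() lets you report a numeric observation that Hypothesis then tries to maximize across examples, steering generation rather than fixing a distribution up front. A minimal sketch (the test body is illustrative, not from the question):

import hypothesis
from hypothesis import given, strategies as st

@given(value=st.floats(min_value=0, max_value=1))
def test_targeted(value: float) -> None:
    # Report value as an observation to maximize; later examples are
    # biased toward inputs that produced higher observations.
    hypothesis.target(value, label="value")
    assert 0.0 <= value <= 1.0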

So my advice overall is to relax, trust the system and the research it's based on, and think about the properties to test and the possible range of valid inputs rather than the distribution within that range.

Zac Hatfield-Dodds