Why did Hypothesis give a falsifying example, when manually reproducing with numpy arrays does not fail?

Question

I was trying to write my first ultra-simple NumPy test case with Hypothesis, but the first thing I thought of seems to have hit a roadblock.

So I did this:

import numpy as np
from hypothesis import given
import hypothesis.strategies as hs
import hypothesis.extra.numpy as hxn
    
def tstind(a, i):
    i = max(i, 0)
    i = min(i, len(a) - 1)
    return a[i]

@given(a=hxn.arrays(dtype=hxn.scalar_dtypes(),
                    shape=hxn.array_shapes(max_dims=1)),
       i=hs.integers())
def test_tstind_typeconserve(a, i):
    assert tstind(a, i).dtype == a.dtype

test_tstind_typeconserve()

Falsifying example:

test_tstind_typeconserve(
    a=array([0.], dtype=float16), i=0,
)

Error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in test_tstind_typeconserve
  File "/tmp/persistent/miniconda3/envs/hypo/lib/python3.7/site-packages/hypothesis/core.py", line 1169, in wrapped_test
    raise the_error_hypothesis_found
  File "<stdin>", line 5, in test_tstind_typeconserve
AssertionError

But:

a=np.array([0.], dtype=np.float16)
i=0
assert tstind(a, i).dtype == a.dtype

(i.e. OK, does not fail)

BTW, the odd case I was expecting it to find is something like this one:

>>> a = np.ma.masked_array([0.], mask=[1], dtype=np.float16)
>>> a.dtype
dtype('float16')
>>> a[0].dtype
dtype('float64')

Zac Hatfield-Dodds · Accepted Answer · 2021-05-25T13:18:11.050


Hypothesis is showing you that NumPy dtypes can differ in byte order. Expanding your test,

    got = tstind(a, i).dtype
    assert got == a.dtype, (a.dtype.byteorder, got.byteorder)

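fails for me with AssertionError: ('>', '='). In other words, the generated array has a big-endian dtype, while the scalar returned by indexing has a native-byte-order dtype, so the two dtypes compare unequal. It's unfortunate that the repr of array objects doesn't include the dtype byteorder (which is why the falsifying example looks identical to your manual reproduction), but here we are.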

(I've reported this as issue 19059, for what it's worth)
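
To reproduce the mismatch by hand without Hypothesis, something like the following minimal sketch should work. It assumes only NumPy; the variable names are illustrative, and it builds the non-native dtype with `newbyteorder("S")` so that it behaves the same way on little-endian and big-endian machines.

```python
import numpy as np

# Build a float16 dtype whose byte order is the opposite of this machine's.
swapped = np.dtype(np.float16).newbyteorder("S")
a = np.zeros(1, dtype=swapped)

# Indexing a single element yields a NumPy scalar, whose dtype is in
# native byte order regardless of the byte order of the parent array.
element = a[0]

# The byte orders differ, so the dtypes compare unequal.
print(a.dtype.byteorder, element.dtype.byteorder)
print(a.dtype == element.dtype)

# One way to compare "the same type, ignoring byte order" is to normalise
# both sides to native byte order before comparing.
same_ignoring_order = a.dtype.newbyteorder("=") == element.dtype.newbyteorder("=")
print(same_ignoring_order)
```

Whether the test should normalise byte order like this, or restrict the generated dtypes instead, depends on what property you actually want `tstind` to guarantee.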

edited May 25 '21 at 13:18

answered May 21 '21 at 14:03


Brilliant catch, thanks! It strikes me that if the Hypothesis 'falsifying example' generation depends on repr, which I guess it must do for generality, then there can be plenty of other distinctions that it would miss (and not just for arrays). Is there any more general way to report on what inputs Hypothesis actually used in a failing case? – pp-mo Jun 04 '21 at 18:02
Hypothesis **does not** depend on the repr - we distinguish inputs based on the sequence of choices used to generate them, and distinguish errors based on the exception type and location (recursively through chained exceptions). – Zac Hatfield-Dodds Jun 06 '21 at 02:30
To see all the examples that Hypothesis generates, [use `verbosity=Verbosity.verbose`](https://hypothesis.readthedocs.io/en/latest/settings.html#hypothesis.settings.verbosity) or pass `pytest -s --hypothesis-verbosity=verbose`. – Zac Hatfield-Dodds Jun 06 '21 at 02:32
"Hypothesis does not depend on the repr" -- yes, I'm sure it doesn't. I was only suggesting that the 'falsifying example' output does, i.e. you can't "specialise" the code output for the array examples so as to include all possible relevant options. If there were a way to specialise it, that would be interesting, though I don't know that I can actually produce a better array.__repr__ than numpy provides. – pp-mo Jun 16 '21 at 17:16
Using verbose mode is instructive. But you still can't see what is critical about this testcase. The information output in the assert does work nicely, though. – pp-mo Jun 16 '21 at 17:17
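
Following up on the point that extra information in the assert message works well: a small helper along these lines can make the byte order visible whenever an assertion fails. This is only a sketch; `describe_array` is a hypothetical name, not part of NumPy or Hypothesis, and it relies on `dtype.str`, which includes the byte-order character.

```python
import numpy as np

def describe_array(arr):
    """Summarise an array, spelling out the dtype byte order explicitly."""
    dt = arr.dtype
    return (
        f"shape={arr.shape}, dtype.str={dt.str!r}, "
        f"byteorder={dt.byteorder!r}, native={dt.isnative}"
    )

# An explicitly big-endian float16 array, like the one in the failing example.
big = np.zeros(1, dtype=">f2")
summary = describe_array(big)
print(summary)

# Usage inside a test, so that a failure reports the detail that matters:
#     assert tstind(a, i).dtype == a.dtype, describe_array(a)
```

The same string could also be passed to `hypothesis.note(...)` inside the test, which attaches it to the report of the final failing example.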
