How to build a strategy to create array of tuples with pairs of identical values?

Question

I'd like to generate a strategy for NumPy testing with an output like:

array([[-2, -2],
       [-3, -3],
       [5,  5],
       [-1, -1]], dtype=int16)

What I tried was:

import numpy as np
from hypothesis.strategies import integers
from hypothesis.extra.numpy import arrays
arrays(np.int16, (4,2), elements=integers(-10, 10)).example()

Unfortunately, I can't make values inside tuples identical so the query above returns:

array([[ 5,  5],
       [-7,  5],
       [ 5,  5],
       [ 5,  5]], dtype=int16)

AO_PDX · Accepted Answer · 2020-02-03T17:35:58.393

I have found that when I need to control the content inside an existing strategy's structure (e.g. pairs of identical values within an array), I have to skip that strategy in favor of lower-level ones, and use those to build a "ready-made" value that seeds the type I actually want to generate.

Let's leverage the fact that numpy.array accepts a list of lists to create an array. Let's also assume you want each row to be unique, as your example does not show duplicate rows. If that is not desired, remove unique_by=str from the depth_strategy definition (and from complete_strategy, which repeats it).

1. Generate an integer and create a list of that value repeated a number of times to meet the WIDTH.
2. Generate a list of a DEPTH length of the kind of lists we created in the first step.
3. Combine the two strategies by nesting them.
4. Feed the result of the third step into numpy.array, making sure the dtype matches the strategy used to generate values in the first step.

# %%
"""Hypothesis strategy for array of tuples with pairs of identical values."""
from hypothesis import given, settings, strategies as st

import numpy as np

WIDTH = 2
DEPTH = 4
MIN_VALUE = -10
MAX_VALUE = 10

# Build the row - Here for clarification only
width_strategy = st.integers(MIN_VALUE, MAX_VALUE).map(
    lambda i: tuple(i for _ in range(WIDTH))
)

# Build the array of rows - Here for clarification only
depth_strategy = st.lists(
    width_strategy, min_size=DEPTH, max_size=DEPTH, unique_by=str
).map(lambda lot: np.array(lot, dtype=np.int64))

# All-in-One
complete_strategy = st.lists(
    st.integers(MIN_VALUE, MAX_VALUE).map(
        lambda i: tuple(i for _ in range(WIDTH))
    ),
    min_size=DEPTH,
    max_size=DEPTH,
    unique_by=str,
).map(lambda lot: np.array(lot, dtype=np.int64))


@settings(max_examples=10)
@given(an_array=complete_strategy)
def create_numpy_array(an_array):
    """Print each array drawn from the strategy."""
    print(f"A numpy array could be:\n{an_array}")


create_numpy_array()

This generates something like:

A numpy array could be:
[[ 3  3]
 [ 9  9]
 [-5 -5]
 [ 0  0]]
A numpy array could be:
[[ 3  3]
 [-2 -2]
 [ 4  4]
 [-5 -5]]
A numpy array could be:
[[ 7  7]
 [ 0  0]
 [-2 -2]
 [-1 -1]]

Note that I set max_examples to 10 because Hypothesis generates values it deems "troublesome" (zero, NaN, infinity and the like) more often than others. So example() or a lower number of examples would probably produce a lot of 4x2 arrays of all zeroes. Fortunately, the unique_by constraint rules that out here, since no two rows can be equal.

I just realized that I should have lifted the np.array into the mapping on the outer strategy. I have updated the code above to reflect that. — AO_PDX, Jan 21 '20 at 23:55
Thank you. Eventually, I reused your logic and compose the strategy inside the @composite component. I will reach out to the maintainers of the hypothesis about creating scientific heavy cook-book examples. — Marcin Charęziński, Jan 22 '20 at 01:46
On your note of moving it into an @composite decorated function. I'm still trying to fully grasp the consequences of the [composite v flatmap discussion](https://stackoverflow.com/questions/59342856/composite-vs-flatmap-in-complex-strategies) and how it might apply to strategies like this one. It might be worth a look if you haven't looked at it yet. — AO_PDX, Jan 23 '20 at 19:46
`unique_by=lambda l: str(l)` -> `unique_by=str`. @MarcinCharęziński, we'd be very happy to link to or merge a cookbook, the problem is finding time to write one! — Zac Hatfield-Dodds, Jan 24 '20 at 06:29
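
For readers following the comments above, here is a minimal sketch of what an @composite version might look like. The function name identical_pair_rows and its parameters are illustrative assumptions, not the asker's actual code, and int16 is taken from the question rather than from the answer's int64. Uniqueness of rows is not enforced in this sketch.

```python
import numpy as np
from hypothesis import given, settings, strategies as st


@st.composite
def identical_pair_rows(draw, rows=4, cols=2, min_value=-10, max_value=10):
    """Draw one integer per row and repeat it across that row (hypothetical helper)."""
    values = draw(
        st.lists(
            st.integers(min_value, max_value),
            min_size=rows,
            max_size=rows,
        )
    )
    # Each drawn value becomes a row of `cols` identical entries.
    return np.array([[v] * cols for v in values], dtype=np.int16)


@settings(max_examples=25, deadline=None, database=None)
@given(identical_pair_rows())
def check_rows_are_constant(arr):
    """Assert the shape, dtype and the identical-pairs property."""
    assert arr.shape == (4, 2)
    assert arr.dtype == np.int16
    assert (arr[:, 0] == arr[:, 1]).all()
```

Calling check_rows_are_constant() runs the property against 25 generated arrays. Whether this is preferable to the .map() chain in the answer is a matter of taste; the composite form mainly helps once the row and column counts should themselves be drawn.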

score 0 · Answer 2 · answered Jan 20 '20 at 00:50

Without looking too much into what np has to offer, you can just generate the tuples using a list comprehension:

tuple_list = [tuple(a) for a in arrays(np.int16, (4,2), elements=integers(-10,10)).example()]

Josh Sharkey

Thank you for the response but that function returns an array of tuples that does not satisfy conditions stated in the original question. – Marcin Charęziński Jan 20 '20 at 00:54
"How to build a strategy to *create array of tuples* with pairs of identical values?" I'm not sure what exactly you are looking for... – Josh Sharkey Jan 20 '20 at 00:56

score 0 · Answer 3 · answered Jan 20 '20 at 01:50

Not sure this is what you're after, but the arrays strategy from hypothesis.extra.numpy doesn't appear to have an option to duplicate values.

You can just construct the array you need like this:

import numpy as np
from hypothesis.strategies import integers
strat = integers(-10, 10)  # min_value must not exceed max_value
np.array([[x, x] for x in [strat.example() for _ in range(4)]], np.int16)

Example result:

array([[-9, -9],
       [ 0,  0],
       [-2, -2],
       [ 0,  0]], dtype=int16)

If you don't like that the second dimension (2) is hard-coded, you can make both dimensions parameters like this:

def get_array(rows, cols, strat):
    # Draw one value per row and repeat it across all columns.
    return np.array([[x] * cols for x in [strat.example() for _ in range(rows)]], np.int16)


get_array(4, 2, integers(-10, 10))

Grismar

Note: at this point, you may wonder why use `hypothesis` at all - but you didn't provide a context, so you may want to for specific reasons. – Grismar Jan 20 '20 at 01:55
"arrays from hypothesis.extra.numpy doesn't appear to have options to duplicate values." I feel like there is a way to create such a strategy, either by building your own strategy or by augmenting or composing the existing one. – Marcin Charęziński Jan 20 '20 at 02:03
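
One way to "augment the existing one", as the comment above suggests, might be to let arrays generate a single column and then repeat that column with .map(). This is a sketch under that assumption, not something proposed in the answers. The names ROWS, COLS, pairs and check_pairs are illustrative.

```python
import numpy as np
from hypothesis import given, settings, strategies as st
from hypothesis.extra.numpy import arrays

ROWS, COLS = 4, 2

# Generate a (ROWS, 1) column, then repeat it along axis 1 so that
# every row ends up holding COLS copies of the same value.
pairs = arrays(np.int16, (ROWS, 1), elements=st.integers(-10, 10)).map(
    lambda column: np.repeat(column, COLS, axis=1)
)


@settings(max_examples=25, deadline=None, database=None)
@given(pairs)
def check_pairs(arr):
    """Assert the shape, dtype and that each row is constant."""
    assert arr.shape == (ROWS, COLS)
    assert arr.dtype == np.int16
    assert (arr == arr[:, :1]).all()
```

Because the result is still an ordinary strategy, it can be passed to @given directly, which avoids calling .example() outside an interactive session.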
