1

I am trying to generate dictionaries whose values have different Python types, using the Hypothesis library.

For lists I can do this simply using the expression:

from hypothesis import given
import hypothesis.strategies as st

@given(
    st.lists(
        st.from_type(type)
            .flatmap(st.from_type)
            .filter(lambda x: not isinstance(x, (type(None)))),
        min_size=2,
        unique_by=lambda x: type(x),
    )
)
def test_something(values):
    ...

which gives me [int, str, ...] (a different Python type for each entry). But for dictionaries, there is no unique_by for the values:

@given(
    st.dictionaries(
        st.text(min_size=1, max_size=10),
        st.from_type(type).flatmap(st.from_type)
            .filter(lambda x: not isinstance(x, (type(None), bool))),
        min_size=2,
    )
)
def test_something(dictionary):
    ...

which can result in, e.g., {'a': int, 'b': int, ...}, i.e. the type of the value is the same for all entries.

Is there an easy way to generate {'a': int, 'b': str, ...} (at least two different Python types in dict.values())?

Azat Ibrakov
DragonTux

2 Answers

3

We can reuse your initial approach, using the fact that dicts can be built from key-value pairs:

from hypothesis import given, strategies as st


@given(st.builds(zip,
                 st.lists(st.text(min_size=1, max_size=10),
                          min_size=2,
                          unique=True),
                 st.lists(st.from_type(type)
                          .flatmap(st.from_type)
                          .filter(lambda x: not isinstance(x, (type(None)))),
                          min_size=2,
                          unique_by=lambda x: type(x),
                          ))
       .map(dict))
def test_something(dictionary):
    values_types = list(map(type, dictionary.values()))

    assert len(set(values_types)) == len(values_types)
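
The assertion above checks that every value has a distinct type, which is stricter than the "at least two different types" condition in the question. A minimal sketch of the weaker check follows; the helper name `has_at_least_two_value_types` is hypothetical and not part of Hypothesis. A predicate like this could presumably also be passed to a strategy's `.filter(...)`, at the cost of more rejected examples.

```python
def has_at_least_two_value_types(dictionary):
    """Return True if the dict's values have at least two distinct exact types.

    Hypothetical helper for illustration; it compares type(value), so
    subclasses (e.g. bool vs. int) count as different types.
    """
    return len({type(value) for value in dictionary.values()}) >= 2


mixed = has_at_least_two_value_types({"a": 1, "b": "x"})
same = has_at_least_two_value_types({"a": 1, "b": 2})
bool_and_int = has_at_least_two_value_types({"a": 1, "b": True})
print(mixed, same, bool_and_int)
```

Because the comparison uses `type(value)` rather than `isinstance`, `True` and `1` count as two different types here, just as they do for `unique_by=lambda x: type(x)`.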
Azat Ibrakov
  • Thanks a lot for this answer, it is what I need. I completely forgot about creating dictionaries this way, although I had seen it in the Hypothesis code only a few days earlier :D. I will mark this answer as the correct one, as the test statistics are better: ``` - 100 passing examples, 0 failing examples, 7 invalid examples (seen up to 25) - Typical runtimes: 1-18 ms (sometimes faster) - Fraction of time spent in data generation: ~ 97% - Stopped because settings.max_examples=100 - Events: * 78.50%, Retried draw from sampled_from ``` – DragonTux Dec 03 '19 at 13:33
2

As Azat Ibrakov mentions, you can build this up from key-value pairs - but it's more efficient to generate a list of pairs than to zip two lists together (because it avoids throwing away some elements if the lists are of different lengths):

from hypothesis import given, strategies as st


@given(
    st.lists(
        st.tuples(
            st.text(min_size=1, max_size=10),
            st.from_type(type).flatmap(st.from_type).filter(lambda x: x is not None),
        ),
        min_size=2,
        unique_by=(lambda kv: kv[0], lambda kv: type(kv[1])),
    ).map(dict)
)
def test_something(dictionary):
    values_types = list(map(type, dictionary.values()))
    assert len(set(values_types)) == len(values_types)
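
To illustrate the point about discarded elements, here is a small plain-Python sketch with made-up sample data (no Hypothesis involved): `zip` stops at the shorter input, so any surplus keys or values that were generated are thrown away, whereas a list of pairs keeps every element.

```python
# Illustrative sample data; in the zip-based approach the two lists are
# generated independently, so their lengths can differ.
keys = ["a", "b", "c"]
values = [1, "x"]

# zip() stops at the shorter input, so the key "c" is silently dropped.
zipped = dict(zip(keys, values))

# With a list of (key, value) pairs, every generated element ends up in the dict.
pairs = [("a", 1), ("b", "x"), ("c", 2.0)]
from_pairs = dict(pairs)

print(zipped)
print(from_pairs)
```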
Zac Hatfield-Dodds
  • Thank you for your answer. It seems roughly equivalent to Azat's in terms of invalid examples (up to 25/30), but runs a bit slower (up to 28 ms): ``` - 100 passing examples, 0 failing examples, 28 invalid examples - Typical runtimes: 0-28 ms - Fraction of time spent in data generation: ~ 98% - Stopped because settings.max_examples=100 - Events: * 64.84%, Retried draw from sampled_from ``` – DragonTux Dec 03 '19 at 13:34
  • 2
    It's running slightly slower because it is much more likely to generate large dictionaries. – Zac Hatfield-Dodds Dec 04 '19 at 03:17