I'm trying to create a hypothesis strategy which produces integers with no repeats. Here's my code:

import hypothesis
import hypothesis.strategies as strategies

def unique(strat):
    previous = set()

    @strategies.composite
    def new_strategy(draw):
        while True:
            value = draw(strat)
            if value not in previous:
                previous.add(value)
                return value

    return new_strategy

strategy = unique(strategies.integers(min_value=0, max_value=1000))

@hypothesis.given(num=strategy)
def test_unique(num):
    pass

However, when I run pytest, I get

    @check_function
    def check_strategy(arg, name=""):
        if not isinstance(arg, SearchStrategy):
            hint = ""
            if isinstance(arg, (list, tuple)):
                hint = ", such as st.sampled_from({}),".format(name or "...")
            if name:
                name += "="
            raise InvalidArgument(
                "Expected a SearchStrategy%s but got %s%r (type=%s)"
                % (hint, name, arg, type(arg).__name__)
            )
E           hypothesis.errors.InvalidArgument: Expected a SearchStrategy but got mapping['num']=<function accept.<locals>.new_strategy at 0x7f30622418b0> (type=function)
Daniel Walker
  • Also, how would your strategy handle strats with a finite set of values, such as `hypothesis.strategies.booleans()` or `hypothesis.strategies.integers(0, 5)`? – NicholasM Sep 15 '22 at 20:34
  • @NicholasM, I admit I've thought of this but don't have an answer yet. For my use case, I'll just make sure to not make the sample size too large. – Daniel Walker Sep 15 '22 at 20:36

1 Answer

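import hypothesis.strategies as st

@st.composite
def unique(draw, strategy):
    # One `seen` set per test case: every draw sharing this key gets the same set.
    seen = draw(st.shared(st.builds(set), key="key-for-unique-elems"))
    return draw(
        strategy
        .filter(lambda x: x not in seen)  # reject values already drawn
        .map(lambda x: seen.add(x) or x)  # record the value, then return it
    )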

There are a couple of cute tricks here:

  1. Use `st.shared()` to create a new `seen` cache for each distinct example. This fixes your "what if you run out of values" problem, but also fixes your critical "the test can't replay failures" problem which would make the whole thing horribly flaky.
    • For advanced tricks, try using the `key=` argument to `shared`, or having it construct a dictionary.
  2. `.filter(...)` to exclude already-seen items. This is better than a loop because it means Hypothesis can more effectively avoid and report on futile attempts to generate a not-yet-seen example.
  3. The `.map(...)` call to add it to the set is a blatant abuse of the facts that `set.add()` returns `None` after mutating the object, and that `or` evaluates to the second item if the first is falsey.
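To see the three tricks working together, here is a minimal usage sketch. The names `_SEEN`, `unique_ints` and `test_three_distinct` are made up for illustration, and hoisting the shared strategy into a module-level variable is my own variation (an assumption, so that every draw reuses the same strategy object), not something the approach requires.

```python
import hypothesis.strategies as st
from hypothesis import given, settings

# Hypothetical module-level handle: one shared strategy object, so every
# draw within a single test case receives the same set instance.
_SEEN = st.shared(st.builds(set), key="key-for-unique-elems")


@st.composite
def unique(draw, strategy):
    seen = draw(_SEEN)
    return draw(
        strategy
        .filter(lambda x: x not in seen)  # skip values already used in this test case
        .map(lambda x: seen.add(x) or x)  # set.add() returns None, so `or` yields x
    )


# Calling unique(...) returns the strategy that is passed to @given.
unique_ints = unique(st.integers(min_value=0, max_value=1000))


@settings(max_examples=50, database=None)
@given(a=unique_ints, b=unique_ints, c=unique_ints)
def test_three_distinct(a, b, c):
    # All three values come from the same test case, so they share one `seen` set.
    assert len({a, b, c}) == 3
```

Uniqueness here is scoped to a single test case: values may repeat between examples, which is what lets Hypothesis replay and shrink failures. Note also that `unique(...)` is called to obtain a strategy. The `InvalidArgument` error in the question arises because its `unique` returns the `@composite`-decorated function itself rather than the result of calling it.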
Zac Hatfield-Dodds