Within the Hypothesis testing library for Python, there is the "assume" function, which "marks the example as bad, rather than failing the test". If too many "bad" examples are generated in a row, Hypothesis will, by default, error out of that test.
Within Hypothesis, there is also a method named "filter" that can be chained onto strategies and that, well, filters out unwanted generated data. If too many generated items are filtered out in a row, Hypothesis will, by default, error out with the same kind of error as assume.
Here is an example of assume, adapted from the docs:
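from hypothesis import assume, given
from hypothesis.strategies import integers, lists

@given(lists(integers()))
def test_sum_is_positive(xs):
    # Reject the whole example unless the list is long and all-positive.
    assume(len(xs) > 10)
    assume(all(x > 0 for x in xs))
    print(xs)
    assert sum(xs) > 0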
Reimagined with filter:
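from hypothesis import given
from hypothesis.strategies import integers, lists

@given(
    lists(
        # Filter each element, then filter the list as a whole.
        integers().filter(lambda x: x > 0)
    ).filter(lambda xs: len(xs) > 10)
)
def test_sum_is_positive_filter(xs):
    print(xs)
    assert sum(xs) > 0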
Are these essentially the same thing? I understand that the examples may be a bit contrived, for educational purposes; they have been shrunk, as it were. If that is the case, I just need to have my imagination stretched. (Note: when I run these two snippets on my Mac running macOS Ventura, the first fails a health check and the second one runs.)
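My best guess at why the two snippets behave differently, sketched as a toy model in plain Python: rejecting a whole list because one element is bad throws away far more work than retrying just that element. To be clear, this does not use Hypothesis at all and is not based on how it actually generates data. The helper names, the value range -5..5, and the maximum length of 20 are all made up for illustration.

```python
import random

def accept_rate_assume_style(rng, trials=10_000):
    """Toy model of assume(): draw a whole list, then keep or reject all of it."""
    kept = 0
    for _ in range(trials):
        xs = [rng.randint(-5, 5) for _ in range(rng.randint(0, 20))]
        if len(xs) > 10 and all(x > 0 for x in xs):
            kept += 1
    return kept / trials

def draw_positive(rng, max_tries=100):
    """Toy model of an element-level filter: retry one element until it passes."""
    for _ in range(max_tries):
        x = rng.randint(-5, 5)
        if x > 0:
            return x
    raise RuntimeError("too many rejected elements in a row")

def accept_rate_filter_style(rng, trials=10_000):
    """Elements are already positive, so only the length check can reject a list."""
    kept = 0
    for _ in range(trials):
        xs = [draw_positive(rng) for _ in range(rng.randint(0, 20))]
        if len(xs) > 10:
            kept += 1
    return kept / trials

rng = random.Random(0)
print("assume-style acceptance rate:", accept_rate_assume_style(rng))
print("filter-style acceptance rate:", accept_rate_filter_style(rng))
```

In this toy model the assume-style version keeps almost nothing, because a list of more than 10 random signed integers is very unlikely to be all positive, while the filter-style version keeps roughly half of its lists. Whether this is what actually happens inside Hypothesis is part of what I am asking.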
What is the real, practical difference between these two functions?
When should I use assume instead of filter?
Is there a performance difference between the two?
I searched through the Hypothesis repo to see if assume and filter were actually the same thing under the hood, and they seemed not to be.