Hypothesis offers two ways to define derived strategies: @composite and .flatmap. As far as I can tell, the former can do anything the latter can do. However, the implementation of the NumPy arrays strategy speaks of some hidden costs:

    # We support passing strategies as arguments for convenience, or at least
    # for legacy reasons, but don't want to pay the perf cost of a composite
    # strategy (i.e. repeated argument handling and validation) when it's not
    # needed.  So we get the best of both worlds by recursing with flatmap,
    # but only when it's actually needed.

which I assume means worse shrinking behavior, but I am not sure, and I could not find this documented anywhere else. So when should I use @composite, when .flatmap, and when should I take this halfway route, as in the implementation linked above?

Marten

1 Answer

@composite and .flatmap are indeed exactly equivalent - anything you can do with one you can also do with the other, and it will have the same performance too.
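To make the equivalence concrete, here is a minimal sketch that builds the same dependent strategy both ways. The names `sized_lists_composite` and `sized_lists_flatmap` are made up for this illustration and are not part of Hypothesis.

```python
from hypothesis import given, strategies as st


# Variant 1: @composite draws a length, then a list of exactly that length.
@st.composite
def sized_lists_composite(draw):
    n = draw(st.integers(min_value=0, max_value=5))
    xs = draw(st.lists(st.integers(), min_size=n, max_size=n))
    return (n, xs)


# Variant 2: the same dependency between the two draws, written with .flatmap.
sized_lists_flatmap = st.integers(min_value=0, max_value=5).flatmap(
    lambda n: st.tuples(
        st.just(n),
        st.lists(st.integers(), min_size=n, max_size=n),
    )
)


@given(sized_lists_composite())
def test_composite(pair):
    n, xs = pair
    assert len(xs) == n


@given(sized_lists_flatmap)
def test_flatmap(pair):
    n, xs = pair
    assert len(xs) == n
```

Both tests check the same property; which spelling to use is mostly a matter of readability.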

I actually wrote that comment, and the reason is that we only sometimes want to use a flatmap/composite, but always want to carefully validate our logic. The way I've set it up, we can avoid calling the validators more than once by using .flatmap - which would require a second function definition if we wanted to use @composite.
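The pattern can be sketched roughly as follows. This is not the actual `arrays()` code; `fixed_length_lists` is a hypothetical helper whose `length` argument may be either an int or a strategy for ints. With a plain value, validation runs once when the strategy is built, whereas the body of a @composite function would run again for every generated example.

```python
from hypothesis import given, strategies as st
from hypothesis.strategies import SearchStrategy


def fixed_length_lists(length):
    """Hypothetical helper: `length` is an int or a strategy for ints."""
    if isinstance(length, SearchStrategy):
        # Only recurse through .flatmap when a strategy was actually passed.
        return length.flatmap(fixed_length_lists)
    # Plain-value path: this validation runs once, at construction time.
    if not isinstance(length, int) or length < 0:
        raise ValueError(f"length={length!r} must be a non-negative int")
    return st.lists(st.integers(), min_size=length, max_size=length)


@given(fixed_length_lists(3))
def test_plain_value(xs):
    assert len(xs) == 3


@given(fixed_length_lists(st.integers(min_value=0, max_value=4)))
def test_strategy_argument(xs):
    assert len(xs) <= 4
```

Note that in the strategy-argument case the helper is still called, and so validates, once per drawn length; the saving applies to the common case where callers pass plain values.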

(There's also an issue of API style, in that those arguments are almost always values but can sometimes be strategies. We now ban such APIs, based largely on the confusion arrays() has caused, in favor of letting users write their own .flatmaps.)

Zac Hatfield-Dodds
  • Makes sense. Can you explain what you mean by "the validators"? Searching for `validator` in the docs only produces two hits, both of which do not really explain anything. And is it accepted best practice not to take strategies as arguments but instead `.flatmap` everything? – Marten Dec 16 '19 at 13:09
  • Internally, we check that e.g. all the arguments are of an allowed type - but obviously once that validator has passed once, we don't need to run it again for every example. Best practice is that any particular argument can accept a value *xor* a strategy. For example in `st.datetimes()`, `min_value` must always be a datetime (never a strategy), and `timezones` must always be a strategy for `tzinfo` objects (never such an object). Deciding which to use is based on frequency and comparing the inconvenience of `st.just(...)` vs `x.flatmap(...)`. – Zac Hatfield-Dodds Dec 18 '19 at 06:31
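
As a small illustration of the `st.just(...)` versus `x.flatmap(...)` trade-off described in the last comment, here is a sketch using `st.datetimes()`. The variable names and the specific dates are assumptions chosen only for this example.

```python
from datetime import datetime, timezone

from hypothesis import given, strategies as st

# `timezones` must be a strategy, so a single fixed tzinfo is wrapped in st.just().
utc_datetimes = st.datetimes(
    min_value=datetime(2000, 1, 1),
    timezones=st.just(timezone.utc),
)

# `min_value` must be a plain datetime, so a varying lower bound needs .flatmap().
start_and_later = st.datetimes(
    min_value=datetime(2000, 1, 1),
    max_value=datetime(2010, 1, 1),
).flatmap(
    lambda start: st.tuples(st.just(start), st.datetimes(min_value=start))
)


@given(utc_datetimes)
def test_fixed_timezone(dt):
    assert dt.tzinfo is timezone.utc


@given(start_and_later)
def test_varying_lower_bound(pair):
    start, later = pair
    assert start <= later
```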