6

I really like to use cmcstl2, an implementation of the Ranges TS. I especially like the optional projections on every STL-algorithm. Invocable types get forwarded (ehm... or not) like this: (min_element.hpp)

template <ForwardIterator I, Sentinel<I> S,
    class Comp = less<>, class Proj = identity>
requires
    IndirectStrictWeakOrder<
        Comp, projected<I, Proj>>()
I min_element(I first, S last, Comp comp = Comp{}, Proj proj = Proj{});

template <ForwardRange Rng, class Comp = less<>, class Proj = identity>
requires
    IndirectStrictWeakOrder<
        Comp, projected<iterator_t<Rng>, Proj>>()
safe_iterator_t<Rng>
min_element(Rng&& rng, Comp comp = Comp{}, Proj proj = Proj{})
{
    return __stl2::min_element(__stl2::begin(rng), __stl2::end(rng),
        __stl2::ref(comp), __stl2::ref(proj));
}

As reference: The range-v3 library implements it like this: (min_element.hpp)

struct min_element_fn {
        template<typename I, typename S, typename C = ordered_less, typename P = ident,
            CONCEPT_REQUIRES_(ForwardIterator<I>() && Sentinel<S, I>() &&
                IndirectRelation<C, projected<I, P>>())>
        I operator()(I begin, S end, C pred = C{}, P proj = P{}) const;

        template<typename Rng, typename C = ordered_less, typename P = ident,
            typename I = range_iterator_t<Rng>,
            CONCEPT_REQUIRES_(ForwardRange<Rng>() &&
                IndirectRelation<C, projected<I, P>>())>
        range_safe_iterator_t<Rng> operator()(Rng &&rng, C pred = C{}, P proj = P{}) const
        {
            return (*this)(begin(rng), end(rng), std::move(pred), std::move(proj));
        }
};

Now I try to understand the difference and reasoning of both approaches. Why should I take Invocable types by value anyway? Why should I not use perfect forwarding for these types?

I understand the second approach more than I do for the first one, since I understand the methodology of taking sink arguments by value.

ildjarn
  • 62,044
  • 9
  • 127
  • 211
Maikel
  • 1,204
  • 7
  • 19

2 Answers2

8

Two reasons:

  1. My reading of the standard library specification is that algorithms may copy user function objects as many times as they like, but are specified to perform all invocations on a single instance. Since cmcstl2 frequently implements algorithms in terms of other algorithms, the simplest way to meet that requirement is to internally pass function objects by reference_wrapper. For example, binary_search calls lower_bound and then determines if the element denoted by the lower bound is an exact match. It passes reference_wrappers to the comparison and project function objects to lower_bound so that it can invoke the same instances later.

  2. Large and/or mutable function objects may be rare, but there's no reason they must be poorly supported in the standard library. Copying is usually cheap, and moving almost always is as well, but passing by reference is "never" expensive. cmcstl2 minimizes both copies an moves of user function objects. (The air quotes on "never" are there to indicate that passing by reference puts a substantially heavier load on the optimizer, increasing compile times and potentially generating poor code in corner cases if alias analysis is confused by the function object references.)

There are some obvious holes in this reasoning. Foremost to me is "If function objects may usefully be stateful, shouldn't the algorithms be returning them to preserve that state, as does std::for_each?" The design of cmcstl2 essentially violates what Elements of Programming calls "The Law of Useful Return." Should we complicate the signatures of the standard algorithms to return as many as three function objects - say a comparator and two projections - to accommodate a 0.1% use case? I think the obvious answer here is "no," especially given that the workaround is so simple: pass a reference_wrapper.

So, why should cmcstl2 in general - and Standard C++'s std::for_each in particular - go out of their way to accommodate large and/or mutable function objects when the workaround similarly is to pass a reference_wrapper? It seems the designer of cmcstl2 has made the same mistake here as did LWG when they made std::for_each return its function object.

Casey
  • 41,449
  • 7
  • 95
  • 125
  • 1
    Source: am author of cmcstl2. – Casey Feb 21 '17 at 18:06
  • `reference_wrapper` isn't as easy to spell when you pass a large/mutable temporary. I also don't think 1) matches the reading of other implementers, though it's nice to have. – T.C. Feb 21 '17 at 18:18
  • I didn't expect to get another answer, but I am glad. Esp. the binary search example is really good. – Maikel Feb 21 '17 at 19:00
7

It's traditional to take Invocable by value because they tend to have a small sizeof, such as in a function pointer or a lambda with a few captures. Such function parameters, according to the ABI, are passed in machine registers, or eliminated entirely in the case of less or identity. On the other hand, passing by reference tends to push the compiler to put a real object in RAM.

Larger objects, or ones with significant mutable state, may be passed through std::ref. The resulting std::reference_wrapper is trivially-copyable and as big as a pointer, so it's efficiently passed by value.

Potatoswatter
  • 134,909
  • 25
  • 265
  • 421
  • So is it simply a matter of taste whether to use `std::ref` versus `std::move` for these types? Is there a reasoning behind copying a stateful function object? Would it be considered to be a 'side-effect' when altering such a state, whereas the STL got functional roots? – Maikel Feb 21 '17 at 09:50
  • 2
    1. For named objects, yes. But usually the argument is an expression or `{}`. 2. Depends what you mean by stateful. A lambda with `[&]` captures is stateful but immutable. Many captures may degrade performance with pass-by-value (or may not, depending on optimization). A lambda with `mutable` could misbehave if copied when it doesn't expect. 3. Functional philosophy may have helped the STL to ignore self-mutating functors, but also they're just uncommon. Reference semantics are perhaps *too* common in C++; it's a bit unusual for a library to default to values but it's still good design. – Potatoswatter Feb 21 '17 at 10:15
  • 2
    I'm the author of range-v3, and I approve this message. :-) – Eric Niebler Feb 21 '17 at 15:06