62

Compare

double average = CalculateAverage(values.begin(), values.end());

with

double average = std::for_each(values.begin(), values.end(), CalculateAverage());

What are the benefits of using a functor over a function? Isn't the first a lot easier to read (even before the implementation is added)?

Assume the functor is defined like this:

class CalculateAverage
{
private:
   std::size_t num;
   double sum;
public:

   CalculateAverage() : num (0) , sum (0)
   {
   }

   void operator () (double elem) 
   {
      num++; 
      sum += elem;
   }

   operator double() const
   {
       return sum / num;
   }
};
DanDan
  • You need to loop over the array yourself in the first case, don't you? – Eric Z Jun 23 '11 at 09:23
  • Will `for_each` really return the average? Don't you need `accumulate`? See http://www.sgi.com/tech/stl/accumulate.html. Here, your second line applies CalculateAverage()() to each member of the sequence, so you'd need some clever running average calculation, plus an instance of CalculateAverage that you can query after the `for_each`. `for_each` will return a copy of your functor. – juanchopanza Jun 23 '11 at 09:42
  • Functors give you more flexibility, at the cost of usually using slightly more memory, at the cost of being more difficult to use correctly, and at the cost of some efficiency. The memory cost is minuscule per object, but when it is 100% (as in the case of one function pointer versus double that amount of memory) and you have a zillion objects, it counts. The "use correctly" cost includes that functors can be freely copied, so to share state they must use an internal pointer and possibly dynamic allocation. And the latter is also the main efficiency cost. – Cheers and hth. - Alf Jun 23 '11 at 09:48
  • @juanchopanza: probably the OP assumes an implicit conversion from CalculateAverage to double. And I don't understand how to calculate an average (not a sum!) with accumulate(). We need to divide by the number of elements; how does accumulate's BinaryOperation know about this number? If BinaryOperation has state and tracks the sum and the number of elements in parallel (and does not use its second operand at all), is it really a clearer solution than for_each? – user396672 Jun 23 '11 at 12:28
  • @user396672: The way you use accumulate could be for example, `struct Average { double total; uintmax_t count; Average() : total(0), count(0) {} Average operator+(double d) { total += d; count += 1; return *this; } operator double() { return total / count; /* undefined if 0! */ }};`. You don't need to use the binary operator parameter at all if you don't want, and the operator doesn't need to track the count, the accumulator can (and should) do it. I think this is about equally clear with the functor you'd pass to `for_each`, the difference is you implement `operator+` instead of `operator()`. – Steve Jessop Jun 23 '11 at 14:38
  • @Steve Jessop: Initially I consider accumulator approach more complicated and less clear, but I found one theoretical argument in favor of accumulator. If accumulator's BinaryOperation has monoid property (in case of average calculation it has), the operations on the collection may be arbitrary grouped and (theoretically) may be run in parallel (I understand that current implementation of stl algorithms and iterator concept itself are essentially sequential, but accumulator approach, indeed, seems more "functional" and "declarative" ) – user396672 Jun 24 '11 at 08:45
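
The accumulator idea discussed in the comments above could be sketched roughly as follows. This is only an illustration: the `Average` type follows the shape suggested in the comment, while `average_with_accumulate` is a hypothetical helper name, and the non-mutating `operator+` is one possible design choice rather than the commenter's exact code.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Accumulator type (sketch): carries both the running total and the count.
struct Average {
    double total;
    std::size_t count;

    Average() : total(0.0), count(0) {}

    // std::accumulate repeatedly evaluates `acc = acc + element`,
    // so returning an updated copy is enough to propagate the state.
    Average operator+(double d) const {
        Average next(*this);
        next.total += d;
        next.count += 1;
        return next;
    }

    // The result is not meaningful for an empty range (count == 0).
    operator double() const { return total / count; }
};

// Hypothetical helper showing the call site.
double average_with_accumulate(const std::vector<double>& values) {
    return std::accumulate(values.begin(), values.end(), Average());
}
```

Because the state travels through the return value of `operator+`, nothing here depends on how or when the algorithm copies a functor.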

7 Answers

86

At least four good reasons:

Separation of concerns

In your particular example, the functor-based approach has the advantage of separating the iteration logic from the average-calculation logic. So you can use your functor in other situations (think about all the other algorithms in the STL), and you can use other functors with for_each.
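
As a rough illustration of that separation, the functor from the question can be handed to `std::for_each` over any container without touching the averaging logic. The helper name `average_of` is an assumption made for this sketch, and the class is repeated only so the snippet stands alone.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// The functor from the question, repeated so the sketch is self-contained.
class CalculateAverage {
    std::size_t num;
    double sum;
public:
    CalculateAverage() : num(0), sum(0) {}
    void operator()(double elem) { ++num; sum += elem; }
    operator double() const { return sum / num; }
};

// Hypothetical helper: iteration is left entirely to std::for_each,
// so the same functor works for a vector, a list, or any other range.
template <typename Container>
double average_of(const Container& c) {
    // for_each returns the functor; it converts implicitly to double.
    return std::for_each(c.begin(), c.end(), CalculateAverage());
}
```

Calling `average_of` on a `std::vector<double>` or a `std::list<double>` uses exactly the same averaging code.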

Parameterisation

You can parameterise a functor more easily. So for instance, you could have a CalculateAverageOfPowers functor that takes the average of the squares, or cubes, etc. of your data, which would be written thus:

class CalculateAverageOfPowers
{
public:
    CalculateAverageOfPowers(float p) : acc(0), n(0), p(p) {}
    void operator() (float x) { acc += pow(x, p); n++; }
    float getAverage() const { return acc / n; }
private:
    float acc;
    int   n;
    float p;
};

You could of course do the same thing with a traditional function, but it then becomes difficult to use with function pointers, because it has a different prototype from CalculateAverage.
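
A possible call site for the parameterised functor might look like the sketch below. The class is the one shown above (with `<cmath>` added for `std::pow`), and `average_of_powers` is a hypothetical helper introduced only for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Same functor as above, repeated so the sketch compiles on its own.
class CalculateAverageOfPowers {
public:
    CalculateAverageOfPowers(float p) : acc(0), n(0), p(p) {}
    void operator()(float x) { acc += std::pow(x, p); n++; }
    float getAverage() const { return acc / n; }
private:
    float acc;
    int   n;
    float p;
};

// Hypothetical helper: the exponent is fixed when the functor is
// constructed, so the call made by for_each still takes one argument.
float average_of_powers(const std::vector<float>& data, float p) {
    return std::for_each(data.begin(), data.end(),
                         CalculateAverageOfPowers(p)).getAverage();
}
```

The exponent lives in the object, so `std::for_each` sees the same one-argument call shape regardless of which power is being averaged.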

Statefulness

And as functors can be stateful, you could do something like this:

CalculateAverage avg;
avg = std::for_each(dataA.begin(), dataA.end(), avg);
avg = std::for_each(dataB.begin(), dataB.end(), avg);
avg = std::for_each(dataC.begin(), dataC.end(), avg);

to average across a number of different data-sets.
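
Put together as something compilable, the chained calls could look like this. The helper `average_across` and its two-dataset signature are assumptions for the sketch; the functor is the one from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// The functor from the question, repeated so the sketch is self-contained.
class CalculateAverage {
    std::size_t num;
    double sum;
public:
    CalculateAverage() : num(0), sum(0) {}
    void operator()(double elem) { ++num; sum += elem; }
    operator double() const { return sum / num; }
};

// Hypothetical helper: for_each takes the functor by value and returns it,
// so assigning the result back carries the state into the next call.
double average_across(const std::vector<double>& dataA,
                      const std::vector<double>& dataB) {
    CalculateAverage avg;
    avg = std::for_each(dataA.begin(), dataA.end(), avg);
    avg = std::for_each(dataB.begin(), dataB.end(), avg);
    return avg;  // implicit conversion to double
}
```

Without the assignments, each call would update only its own copy and `avg` would stay empty.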

Note that almost all STL algorithms/containers that accept functors require them to be "pure" predicates, i.e. have no observable change in state over time. for_each is a special case in this regard (see e.g. Effective Standard C++ Library - for_each vs. transform).

Performance

Functors can often be inlined by the compiler (the STL is a bunch of templates, after all). Whilst the same is theoretically true of functions, compilers typically won't inline through a function pointer. The canonical example is to compare std::sort vs qsort; the STL version is often 5-10x faster, assuming the comparison predicate itself is simple.
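
To make the comparison concrete, here is a minimal sketch of the two ways to pass a comparison to `std::sort`. All names are hypothetical, and no timing claim is attached: whether the function-pointer version is actually slower depends on the compiler and optimisation settings.

```cpp
#include <algorithm>
#include <vector>

// Plain function: std::sort is instantiated for the type bool(*)(int, int),
// so the call target is only known through the pointer value.
bool less_fn(int a, int b) { return a < b; }

// Functor: the comparison is part of the type std::sort is instantiated
// with, which makes the call target visible at compile time.
struct LessFunctor {
    bool operator()(int a, int b) const { return a < b; }
};

std::vector<int> sorted_with_function(std::vector<int> v) {
    std::sort(v.begin(), v.end(), less_fn);
    return v;
}

std::vector<int> sorted_with_functor(std::vector<int> v) {
    std::sort(v.begin(), v.end(), LessFunctor());
    return v;
}
```

Both produce the same result; the difference is only in what the compiler knows about the comparison when it instantiates the template.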

Summary

Of course, it's possible to emulate the first three with traditional functions and pointers, but it becomes a great deal simpler with functors.

Roope Hakulinen
Oliver Charlesworth
  • The `acc` in your example, is that a global variable? – Cheers and hth. - Alf Jun 23 '11 at 09:44
  • You're welcome, but I was trying to give a hint about something a little bigger, which affects your answer fundamentally (I think). I just tried your code, and Visual C++ protests `operator =' function is unavailable in 'CalculateAverageOfPowers'`. g++ similarly complains `error: non-static const member 'const float CalculateAverageOfPowers::p', can't use default assignment operator` – Cheers and hth. - Alf Jun 23 '11 at 09:58
  • @Oli: last time I tested `qsort` it allocated extra memory, so this quite affects performance. Furthermore, it requires a table of pointers, thus has lower locality.... It would be better to compare `std::sort` with predicate with `std::sort` with function pointer, to make your point. – Matthieu M. Jun 23 '11 at 10:07
  • @Matthieu: That's a fair point. To be honest, I'm echoing the sentiment that Scott Meyers makes in *"Effective STL"*, and that I've observed in practice. I'll look into profiling `std::sort` with functor vs. function-pointer. But fundamentally, I don't think there's any reason that `qsort` couldn't be implemented in the same way as `std::sort` with a function pointer. – Oliver Charlesworth Jun 23 '11 at 10:12
  • Now it's technically OK. I just didn't see your assignments! He he, I had to check back that they'd been there all the time... But it doesn't calculate average, it just accumulates. I gather that a little name change fixes that. :-) – Cheers and hth. - Alf Jun 23 '11 at 10:19
  • @Alf: Sure, yes, you'd also need to add a "getAverage()" function. Let me add the prototype for that. – Oliver Charlesworth Jun 23 '11 at 10:20
  • @Matthieu: "last time I tested qsort it allocated extra memory" - I'm surprised, I'd expect there to be adequate space on the stack for qsort, and for pretty much any qsort implementation to rely on this. Furthermore, malloc is allowed to fail and qsort isn't, so typically it would have to abort as a special-case. Finally, qsort does not require a table of pointers, the element size is one of its parameters and pointers to the elements are passed to the comparator function. You can qsort an array of anything and expect the same locality as other similar array operations. – Steve Jessop Jun 23 '11 at 10:55
  • The statefulness example is really good, I hadn't thought of using functors like that. – DanDan Jun 23 '11 at 11:45
  • Beware that it is slightly questionable whether it's valid to rely on the state of functors in this way. See http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-active.html#92, and http://stackoverflow.com/questions/4045228/c-for-each-and-object-functions/4045483#4045483. It's pretty difficult to imagine an implementation of `for_each` that breaks this, but the example for `remove_if` given in the defect report is more "dangerous". `for_each` is a bit under-specified, basically. It's probably "obvious" what the standard intends, but that's not what it says. For other algorithms it's less "obvious". – Steve Jessop Jun 23 '11 at 13:39
  • So, `std::accumulate` is useful precisely because it is carefully defined exactly how the state you're interested in is propagated. C++0x also changes the definition of `for_each` to define how the state propagates. – Steve Jessop Jun 23 '11 at 13:40
  • @Steve: If this (http://drdobbs.com/184403769) and this (http://www.sgi.com/tech/stl/for_each.html) are to be believed, then `for_each` is perfectly safe within stateful functors. For other algorithms, I agree, the functors must be pure. The SO answer you linked to is, unfortunately, nonsense. The whole reason `for_each` returns a copy of the functor is so that you can get at the modified state. – Oliver Charlesworth Jun 23 '11 at 13:48
  • @Oli: the comments under the answer go into more detail. The issue isn't what `for_each` is *supposed* to do, it's what the standard *actually says* it does, which is why the `remove_if` equivalent was proposed as a defect in the standard and the conclusion was to interpret strictly, i.e. no state. SGI/STL predates the standard, just because it says something doesn't make it true of standard C++. Likewise Dobb's can say what it likes about intent, but if Kreft/Langer and Josuttis disagree what the standard says, then in my mind that makes the issue "questionable". – Steve Jessop Jun 23 '11 at 14:02
  • Granted, Josuttis doesn't mention `for_each` in the DR, and `remove_if` doesn't return the predicate. But I think it's true that the language defining `for_each` in C++03 is similarly vague. – Steve Jessop Jun 23 '11 at 14:05
  • @Steve: I don't see where that WG issue talks about `for_each`. `remove_if` is a completely different case; it takes a predicate, which by (moral) definition is pure, so copying/order doesn't matter. As far as I can see, the standard is quite clear about the order and the copying (or the lack thereof) involved in `for_each`... – Oliver Charlesworth Jun 23 '11 at 14:08
  • @Oli: I don't think it is clear, it says "returns f". Clearly this means a copy of f (since it's return-by-value), and it doesn't say when that copy is made. We know what it should say, of course, I'm only criticising what it does say. The DR mentions algorithms in general as well as the `remove_if` / predicate example in particular, but only solves the latter. Predicates may be morally pure, but the C++ neglects to require them to be. It's the non-specification of copies that makes the code example "the user's fault", not solely the impurity of the predicate. – Steve Jessop Jun 23 '11 at 14:16
  • But anyway, regardless of the fact that for_each is OK in practice, and in the intent of the standard, and perhaps also in the language of the standard if you don't accept my criticism of that language, it's still the case that one should "beware". Don't see this use of for_each and think you can go around sticking functors with mutable state everywhere, and don't assume anything about the order of copies unless it's stated in the standard. I believe with `for_each` it is (accidentally) not stated, with remove_if it is (deliberately) not stated. YMMV. – Steve Jessop Jun 23 '11 at 14:28
  • @Steve: I absolutely agree with your point about not creating stateful functors everywhere. `for_each` is (almost?) unique in that respect within the STL. Going back to the standard, IMHO, I'm not sure how it can be interpreted in any way other than as directly mapping to e.g. the pseudo-code at http://www.cplusplus.com/reference/algorithm/for_each/. – Oliver Charlesworth Jun 23 '11 at 14:34
  • @Oli: clearly that's the sensible interpretation of the standard, and as I said to Jerry back in the comments to that answer you didn't like, I was therefore surprised to discover that I can't find anything in the standard to forbid `Function g = f; for ( ; first!=last; ++first ) f(*first); return g;`. It returns a copy of f, which is the only thing that "returns f" can possibly mean in the spec of for_each, and the significance of Josuttis' issue report is that for any algorithm (not just those with predicates), we are not permitted to assume anything about order of copies that is not stated. – Steve Jessop Jun 23 '11 at 14:46
  • But this is a very hair-splitting claim of mine about something that doesn't matter in practice. No implementation does that, or not one we'd want to use. It's fixed in C++0x: for_each returns `std::move(f)`, so clearly the move must be done last since it invalidates f. So possibly I should not have mentioned for_each at all as an example of why to beware. I only came in because the questioner said, "I hadn't thought of using functors like that", and you *can't* in general use functors like that, although you can (in practice) with `for_each`. `remove_if` is an undisputed illustration why not. – Steve Jessop Jun 23 '11 at 14:54
  • As for any concerns of performance, for 1e8 iterations of the two examples given in this thread, they both performed exactly the same (**disclaimer:** with -O3 and same translation unit): `Functor completed in 8.665 seconds | Function completed in 8.661 seconds`. Ref: http://www.pastie.org/pastes/2465298 – hiddensunset4 Sep 01 '11 at 12:16
10

Advantages of Functors:

  • Unlike functions, functors can have state.
  • Functors fit the OOP paradigm better than free functions do.
  • Functors can often be inlined, unlike calls through function pointers.
  • Functors don't require a vtable or runtime dispatch, and hence are more efficient in most cases.
Alok Save
9

std::for_each is easily the most capricious and least useful of the standard algorithms. It's just a nice wrapper for a loop. However, even it has advantages.

Consider what your first version of CalculateAverage must look like. It will have a loop over the iterators, and then do stuff with each element. What happens if you write that loop incorrectly? Oops; there's a compiler or runtime error. The second version can never have such errors. Yes, it's not a lot of code, but why do we have to write loops so often? Why not just once?

Now, consider real algorithms; the ones that actually do work. Do you want to write std::sort? Or std::find? Or std::nth_element? Do you even know how to implement it in the most efficient way possible? How many times do you want to implement these complex algorithms?

As for ease of reading, that's in the eyes of the beholder. As I said, std::for_each is hardly the first choice for algorithms (especially with C++0x's range-based for syntax). But if you're talking about real algorithms, they're very readable; std::sort sorts a list. Some of the more obscure ones like std::nth_element won't be as familiar, but you can always look it up in your handy C++ reference.

And even std::for_each is perfectly readable once you use lambdas in C++0x.
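
For instance, with a C++11 lambda the averaging from the question might be written as below. This is a sketch; `average_with_lambda` is a hypothetical name, and the result is not meaningful for an empty range.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: the lambda captures the running state by reference,
// so no separate functor class is needed.
double average_with_lambda(const std::vector<double>& values) {
    double sum = 0.0;
    std::size_t num = 0;
    std::for_each(values.begin(), values.end(),
                  [&sum, &num](double elem) { sum += elem; ++num; });
    return sum / num;
}
```

Since the state lives in local variables captured by reference, it does not matter that `std::for_each` copies the callable.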

Nicol Bolas
3

• Unlike functions, functors can have state.

This is very interesting because std::less and std::equal_to (both derived from std::binary_function) declare their operator() as const. But what if you wanted to print a debug message with the current call count for that object? How would you do it?

Here is the template for std::equal_to:

template <typename _Tp>
struct equal_to : public binary_function<_Tp, _Tp, bool>
{
  bool
  operator()(const _Tp& __x, const _Tp& __y) const
  { return __x == __y; }
};

I can think of 3 ways to allow the operator() to be const, and yet change a member variable. But what is the best way? Take this example:

#include <iostream>
#include <string>
#include <algorithm>
#include <functional>
#include <cassert>  // assert() MACRO

// functor for comparing two integers by their quotient under integer division by 10.
// So 50..59 are same, and 60..69 are same.
// Used by std::sort()

struct lessThanByTen: public std::less<int>
{
private:
    // data members
    int count;  // nr of times operator() was called

public:
    // default CTOR sets count to 0
    lessThanByTen() :
        count(0)
    {
    }


    // hides (does not override: it is non-virtual) the operator() in std::less<int>, which simply compares two integers
    bool operator() ( const int& arg1, const int& arg2) const
    {
        // this won't compile, because a const method cannot change a member variable (count)
//      ++count;


        // Solution 1. this trick allows the const method to change a member variable
        ++(*(int*)&count);

        // Solution 2. this trick also fools the compilers, but is a lot uglier to decipher
        ++(*(const_cast<int*>(&count)));

        // Solution 3. a third way to do same thing:
        {
        // first, stack copy gets bumped count member variable
        int incCount = count+1;

        const int *iptr = &count;

        // this is now the same as ++count
        *(const_cast<int*>(iptr)) = incCount;
        }

        std::cout << "DEBUG: operator() called " << count << " times.\n";

        return (arg1/10) < (arg2/10);
    }
};

void test1();
void printArray( const std::string msg, const int nums[], const size_t ASIZE);

int main()
{
    test1();
    return 0;
}

void test1()
{
    // unsorted numbers
    int inums[] = {33, 20, 10, 21, 30, 31, 32, 22, };

    printArray( "BEFORE SORT", inums, 8 );

    // sort by quotient of integer division by 10
    std::sort( inums, inums+8, lessThanByTen() );

    printArray( "AFTER  SORT", inums, 8 );

}

//! @param msg can be "this is a const string" or a std::string because of implicit string(const char *) conversion.
//! print "msg: 1,2,3,...N", where 1..8 are numbers in nums[] array

void printArray( const std::string msg, const int nums[], const size_t ASIZE)
{
    std::cout << msg << ": ";
    for (size_t inx = 0; inx < ASIZE; ++inx)
    {
        if (inx > 0)
            std::cout << ",";
        std::cout << nums[inx];
    }
    std::cout << "\n";
}

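Because all 3 solutions are compiled in, each call increments count by 3. The count sometimes goes backwards in the output, presumably because std::sort copies the comparator internally and each copy carries its own count. Here's the output: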

gcc -g -c Main9.cpp
gcc -g Main9.o -o Main9 -lstdc++
./Main9
BEFORE SORT: 33,20,10,21,30,31,32,22
DEBUG: operator() called 3 times.
DEBUG: operator() called 6 times.
DEBUG: operator() called 9 times.
DEBUG: operator() called 12 times.
DEBUG: operator() called 15 times.
DEBUG: operator() called 12 times.
DEBUG: operator() called 15 times.
DEBUG: operator() called 15 times.
DEBUG: operator() called 18 times.
DEBUG: operator() called 18 times.
DEBUG: operator() called 21 times.
DEBUG: operator() called 21 times.
DEBUG: operator() called 24 times.
DEBUG: operator() called 27 times.
DEBUG: operator() called 30 times.
DEBUG: operator() called 33 times.
DEBUG: operator() called 36 times.
AFTER  SORT: 10,20,21,22,33,30,31,32
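
A fourth option, not among the three tried above, is to declare the counter `mutable`, which is the language feature intended for members that may change inside a const member function. The sketch below uses hypothetical names and drops the `std::less<int>` base class for brevity; it calls the functor directly rather than through `std::sort` so the count is easy to follow.

```cpp
// Sketch of the comparator with a mutable call counter.
struct LessThanByTen {
    // mutable: may be modified even inside const member functions.
    mutable int count;

    LessThanByTen() : count(0) {}

    bool operator()(const int& arg1, const int& arg2) const {
        ++count;  // allowed without any cast because count is mutable
        return (arg1 / 10) < (arg2 / 10);
    }
};

// Hypothetical helper: even a const instance can count its calls.
int calls_after_three_comparisons() {
    const LessThanByTen cmp;
    cmp(33, 20);
    cmp(20, 21);
    cmp(10, 30);
    return cmp.count;
}
```

Unlike the cast-based tricks, this stays well defined even when the functor object itself is declared const.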
joe
2

In the first approach, the iteration code has to be duplicated in every function that wants to do something with the collection. The second approach hides the details of iteration.

Vijay Mathew
1

OOP is the keyword here.

http://www.newty.de/fpt/functor.html:

4.1 What are Functors ?

Functors are functions with a state. In C++ you can realize them as a class with one or more private members to store the state and with an overloaded operator () to execute the function. Functors can encapsulate C and C++ function pointers employing the concepts templates and polymorphism. You can build up a list of pointers to member functions of arbitrary classes and call them all through the same interface without bothering about their class or the need of a pointer to an instance. All the functions just have got to have the same return-type and calling parameters. Sometimes functors are also known as closures. You can also use functors to implement callbacks.

1

You are comparing functions at different levels of abstraction.

You can implement CalculateAverage(begin, end) either as:

template<typename Iter>
double CalculateAverage(Iter begin, Iter end)
{
    return std::accumulate(begin, end, 0.0, std::plus<double>()) / std::distance(begin, end);
}

or you can do it with a for loop

template<typename Iter>
double CalculateAverage(Iter begin, Iter end)
{
    double sum = 0;
    int count = 0;
    for(; begin != end; ++begin) {
        sum += *begin;
        ++count;
    }
    return sum / count;
}

The former requires you to know more things, but once you know them, it is simpler and leaves fewer possibilities for error.

It also uses only two generic components (std::accumulate and std::plus), which is often true in more complex cases too. You can often take a simple, universal functor (or function; a plain old function can act as a functor) and simply combine it with whatever algorithm you need.
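
As a small sketch of that kind of mixing and matching, the standard functors from `<functional>` can be dropped into different algorithms. The helper names `product_of` and `sorted_descending` are made up for this example.

```cpp
#include <algorithm>
#include <functional>
#include <numeric>
#include <vector>

// std::multiplies combined with std::accumulate gives a product.
int product_of(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());
}

// std::greater combined with std::sort gives a descending sort.
std::vector<int> sorted_descending(std::vector<int> v) {
    std::sort(v.begin(), v.end(), std::greater<int>());
    return v;
}
```

Neither helper contains a hand-written loop; each is just a generic algorithm paired with a generic functor.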

Jan Hudec