2

Most of the functions in <functional> use functors. If I write a struct like this:

struct Test
{
   bool operator()()
   {
       //Something
   }
   //No member variables
};

Is there a perf hit? Would an object of Test be created? Or can the compiler optimize the object away?

nakiya
  • why not a simple function instead of a functor since there is no member variable in your functor? – Chubsdad Dec 02 '10 at 06:06
  • @Chubsdad: Because of the 'Trick' mentioned in this question's answer. http://stackoverflow.com/questions/442026/function-overloading-by-return-type – nakiya Dec 02 '10 at 06:08
  • I also asked this question: http://stackoverflow.com/q/4332286/57428 – sharptooth Dec 02 '10 at 06:25
  • @Chubsdad: I had the idea that functors were actually faster because the code could be inlined in this case, but could not be when using a pointer to function (as in C). Now I am suddenly wondering whether a template algorithm in C++ (like sort) could actually inline the function as well. Do you know? – Matthieu M. Dec 02 '10 at 07:39
  • Is this a serious performance concern? Unless you are dynamically allocating objects of type Test, the overhead will be negligible in the grand scheme of things! I think you really should profile and if this is a bottleneck, then think of a different approach - else focus on more important things... – Nim Dec 02 '10 at 10:12

4 Answers

3

Yes, the compiler can optimize the "object creation" (which is trivial in this case) away if it chooses to. However, if you really care, you should compile your program and inspect the assembly code.
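One way to follow this advice is to put the functor in a tiny translation unit, as in the sketch below. The names `call_test` and `self_check` are hypothetical and the body of `operator()` is a placeholder. With GCC or Clang, `g++ -O2 -S test.cpp` writes the generated assembly to `test.s`, where you can look at `call_test` and see whether any trace of the `Test` object remains.

```cpp
// Minimal sketch for inspecting what the compiler generates.
// Assumed file name: test.cpp. Compile with: g++ -O2 -S test.cpp
#include <cassert>

struct Test
{
    bool operator()() const
    {
        return true; // placeholder for "Something"
    }
    // No member variables
};

// Creates a temporary Test and calls it. This is the function to
// look for in the assembly output.
bool call_test()
{
    return Test()();
}

// Optional runtime sanity check, separate from the function above.
void self_check()
{
    assert(call_test());
}
```

Comparing the output of `-O0` and `-O2` for the same file is a quick way to see what the optimizer removed.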

sharptooth
  • Unfortunately, I'm not familiar with assembly. – nakiya Dec 02 '10 at 06:06
  • 2
    @nakiya: Then it's time to get familiar with it. There's no more convenient way to find out what the compiler actually did than inspecting the assembly code. Either that, or just be optimistic and trust the compiler. – sharptooth Dec 02 '10 at 06:07
  • Try to make the smallest running example that does what you want. It should then not be that hard to figure out the assembly. It's your best bet to see what actually happens. – Bart Dec 02 '10 at 06:10
  • But use assembly with care. You really start treading on implementation defined behavior. – Chubsdad Dec 02 '10 at 06:11
  • @BKevelham: Exactly. However, this "smallest running example" is not necessary in Visual C++: one can just put a breakpoint near the code of interest and, after the debugger stops there, inspect the assembly code aligned side-by-side with the C++ code. This is very convenient since there's no need to figure out where the code of interest is in the binary. I'm almost sure that other debuggers can do that too. – sharptooth Dec 02 '10 at 06:13
  • +1 to inspecting assembly code - different compilers optimize things differently (in fact, the same compiler will behave differently with different switches). In this case, I know of one compiler (BCC32) that does *not* optimize out the object creation. In fact, it performs value initialization on the object to further worsen performance. CL /Ox /EHsc does however optimize it out. – Zach Saw Dec 02 '10 at 06:23
3

Even if the compiler were having a bad day and somehow couldn't figure out how to optimize this (it's very simple as optimizations go), with no data members and no constructor the "performance hit" to "create an object" would be at most one instruction to adjust the stack pointer (since every object must have a unique address), plus maybe a couple more to copy the object if the compiler also fails to inline the function call that uses the functor. "Creating objects" is cheap. What takes time is allocating memory via new, because the allocator has to find a contiguous block that isn't being used by something else and may have to ask the OS for more memory. Putting things on the stack is trivial.
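To make the stack-versus-heap distinction concrete, here is a minimal sketch. The function names `call_on_stack` and `call_on_heap` are hypothetical, and the functor body is a placeholder. Both functions return the same result; the difference is only in where the object lives. Note that some optimizers may remove the heap allocation as well, so this illustrates the two forms rather than serving as a benchmark.

```cpp
#include <cassert>
#include <memory>

struct Test
{
    bool operator()() const { return true; } // placeholder body
};

// Automatic (stack) object: at most a stack-pointer adjustment,
// and typically nothing at all once the optimizer has run.
bool call_on_stack()
{
    Test t;
    return t();
}

// Heap object: creation goes through the allocator (operator new),
// which is the expensive part unless the compiler elides it.
bool call_on_heap()
{
    std::unique_ptr<Test> t(new Test);
    return (*t)();
}

// Both paths must agree on the result.
void self_check()
{
    assert(call_on_stack() == call_on_heap());
}
```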

Karl Knechtel
3

GCC, at least, can optimize away the object creation and inline your functor, so you can expect the same performance as with a hand-written loop. Of course, you must compile with optimizations enabled, e.g. -O2.
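As a rough illustration of the two forms being compared, the sketch below passes a stateless functor to `std::count_if` and writes the same logic as a plain loop. `IsEven`, `count_with_functor` and `count_with_loop` are hypothetical names chosen for this example. Whether the two compile to the same machine code depends on the compiler and flags, so it is worth checking the assembly rather than assuming.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stateless functor: true for even numbers.
struct IsEven
{
    bool operator()(int x) const { return x % 2 == 0; }
};

// Passes a temporary functor object to a standard algorithm.
std::ptrdiff_t count_with_functor(const std::vector<int>& v)
{
    return std::count_if(v.begin(), v.end(), IsEven());
}

// The equivalent hand-written loop.
std::ptrdiff_t count_with_loop(const std::vector<int>& v)
{
    std::ptrdiff_t n = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] % 2 == 0)
            ++n;
    return n;
}

// The two versions must always produce the same count.
void self_check(const std::vector<int>& v)
{
    assert(count_with_functor(v) == count_with_loop(v));
}
```

Because the functor's type is part of the template instantiation, the compiler knows exactly which `operator()` is called, which is what makes inlining straightforward.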

Begemoth
0

There is no "use" of the structure, so as the code currently stands, it is still just a definition (and takes up no space).

If you create an object of type Test, it will take up non-zero space. If the compiler can deduce that nothing takes its address (or anything similar), it is free to optimize away the space usage.
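The non-zero size can be checked at compile time. The sketch below assumes a C++11 compiler; `distinct_addresses` is a hypothetical helper and the functor body is a placeholder.

```cpp
#include <cassert>
#include <type_traits>

struct Test
{
    bool operator()() const { return true; } // placeholder body
};

// Even a class with no data members has a size of at least one byte.
static_assert(sizeof(Test) >= 1, "objects occupy at least one byte");
static_assert(std::is_empty<Test>::value, "Test has no data members");

// The language guarantees that a and b have different addresses.
// If the addresses are never observed, the optimizer may give the
// objects no storage at all.
bool distinct_addresses()
{
    Test a;
    Test b;
    return &a != &b;
}

void self_check()
{
    assert(distinct_addresses());
}
```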

lijie