8

Suppose I declare a function foo(int arg1, int arg2 = 0, int arg3 = 0, int arg4 = 0). The last three arguments will be specified only occasionally (if ever), and mostly the function will be called as foo(some_int). Would I gain performance by instead declaring the function as foo(int arg1) and using a different mechanism to pass the other arguments when they are really needed?

In other words, do declared but unspecified default arguments make a function call slower?

The function in this case is a constructor for an object, but it's a general question.

Ludwik
  • 2
    Depends. If you pass complex objects by value, it adds to the overhead. If your code is inlined, unused arguments may be optimized out. – JarkkoL Jun 18 '14 at 12:38

2 Answers

5

(You can just read the conclusion at the end if you like)

I did a benchmark to test this. I first ran this short program about ten times:

#include <cstdio>
#include <ctime>

// Function under test: a single int parameter, no default arguments.
int returnMe(int me)
{
    return me;
}


int main()
{
   std::clock_t begin = std::clock();
   for(int i = 0; i < 100000000; i++)
   {
       int me = returnMe(i);
   }
   std::clock_t end = std::clock();
   // Elapsed CPU time in milliseconds.
   std::printf("\nTime: %f ms\n", 1000.0 * (end - begin) / CLOCKS_PER_SEC);

   return 0;
}

Each run executes the function returnMe a hundred million times and then tells me how long that took. Values ranged from 280 ms to 318 ms. Then I ran this program:

#include <cstdio>
#include <ctime>

// Same function under test, with 12 unused default arguments added.
int returnMe(int me, int me87 = 0, int m8e = 0, int m5e = 0, int m34e = 0,
             int m1e = 0, int me234 = 0, int me332 = 0, int me43 = 0,
             int me34 = 0, int me3 = 0, int me2 = 0, int me1 = 0)
{
    return me;
}


int main()
{
   std::clock_t begin = std::clock();
   for(int i = 0; i < 100000000; i++)
   {
       int me = returnMe(i);
   }
   std::clock_t end = std::clock();
   // Elapsed CPU time in milliseconds.
   std::printf("\nTime: %f ms\n", 1000.0 * (end - begin) / CLOCKS_PER_SEC);

   return 0;
}

I ran this one about ten times as well, and the values now ranged from 584 ms to 624 ms.

Conclusion: Yes, it will make the function call slower, but by a very small amount (roughly 3 ns per call in this unoptimized build, going by the numbers above). Creating a separate function for passing the other arguments to the object, or having a different constructor, would be a performance gain, but would it be worth the extra code?

There is another way of solving it, used by Box2D, which is basically creating a separate struct for the default arguments and passing a pointer to an instance of it. That way, when no extra arguments need to be set, the only "garbage argument" passed that decreases your performance is one null pointer, and that is not so bad. When you want to specify some of the default values, you create an instance of said struct on the stack, fill in the values you want, then pass its address to the function. Easy, elegant and efficient.

However: Both proposed ways of saving performance (an extra function and passing a struct pointer) require additional code. If your function will be called rarely, and the extra arguments are not that many, chances are the saved performance will not make any difference at all, and if that is the case, it's not worth your time. Only optimize if it's necessary. Remember that I added 12 default arguments and, even without optimizations, the time for the whole loop only about doubled.

======== EDIT: bonus for serious testing.

So the first two tests were done with plain simple compile command g++ test.cpp -o test.exe. As pointed out in numerous comments, that implies an optimization level of -O0. What results would we get from testing at -O3?

I repeated the tests, now compiling with g++ test.cpp -o test.exe -O3, but found that the program was now completing in 1-2 ms. I tried to ramp up the iterations to one trillion, then one hundred trillion, with the same result. So I figured g++ was probably seeing that I was declaring a variable I was not going to use, and was therefore skipping the calls to returnMe, and maybe the whole loop altogether.

To get some useful results, I added actual functionality to returnMe, to make sure that it was not optimized away. Here are the programs used:

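#include <cstdio>
#include <ctime>

// Global accumulator: gives the loop a visible effect so it is not optimized away.
long long bar = 0;

int returnMe(int me)
{
    bar -= me;
    return me;
}


int main()
{
   std::clock_t begin = std::clock();
   for(int i = 0; i < 1000000000; i++)
   {
       int me = returnMe(i);
       bar -= me * 2;
   }
   std::clock_t end = std::clock();
   // Elapsed CPU time in milliseconds.
   std::printf("\nTime: %f ms\n", 1000.0 * (end - begin) / CLOCKS_PER_SEC);
   // bar is a long long, so it needs %lld rather than %i.
   std::printf("Bar: %lld\n", bar);

   return 0;
}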

and

#include <cstdio>
#include <ctime>

// Global accumulator: gives the loop a visible effect so it is not optimized away.
long long bar = 0;

int returnMe(int me, int me87 = 0, int m8e = 0, int m5e = 0, int m34e = 0,
             int m1e = 0, int me234 = 0, int me332 = 0, int me43 = 0,
             int me34 = 0, int me3 = 0, int me2 = 0, int me1 = 0)
{
    bar -= me;
    return me;
}

int main()
{
   std::clock_t begin = std::clock();
   for(int i = 0; i < 1000000000; i++)
   {
       int me = returnMe(i);
       bar -= me * 2;
   }
   std::clock_t end = std::clock();
   // Elapsed CPU time in milliseconds.
   std::printf("\nTime: %f ms\n", 1000.0 * (end - begin) / CLOCKS_PER_SEC);
   // bar is a long long, so it needs %lld rather than %i.
   std::printf("Bar: %lld\n", bar);

   return 0;
}

Results:

First program: from 653 to 686 ms

Second program: from 652 to 735 ms

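As I expected, the slowest runs of the second program are still slower than those of the first, but the two ranges now overlap and the fastest runs are practically identical (652 ms versus 653 ms), so the difference is far less noticeable.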

Ludwik
  • 3
    What optimization level and compiler was this with? – ghostofstandardspast Jun 18 '14 at 12:37
  • `returnMe` does nothing. Do something more complex, and see again. The values will be comparable – BЈовић Jun 18 '14 at 12:38
  • g++, no additional arguments so "default" optimization level I guess. Will edit to add the info. – Ludwik Jun 18 '14 at 12:39
  • 7
    @Ludwik You should test with -O3 or at least -O2. Never test performance without optimizations enabled, since you should never release without optimizations, it is pointless to test performance without it. – fbafelipe Jun 18 '14 at 12:41
  • I think it would be a fairer comparison if you had the first piece of code using a real alternative to the default parameters. – Dan Jun 18 '14 at 12:41
  • It was a quick test, but now that you mention it, I will test with optimizations on, and report back what happened. – Ludwik Jun 18 '14 at 12:43
  • Whoa, -O3 made the first program run in 2 ms! Gonna have to add more iterations. – Ludwik Jun 18 '14 at 12:46
  • I have a feeling -O3 is just optimizing away what I'm doing. Gonna have to add some actual functionality to returnMe. – Ludwik Jun 18 '14 at 12:49
  • 1
    @Ludwik, any modern optimizing compiler will eliminate your call to `returnMe()` and even the `for` loop itself. You're doing this benchmark wrong. – qehgt Jun 18 '14 at 13:00
  • As the question says all default values are `0`, I suspect the implementation has some code of the form `if(arg0){do stuff};`, which would make your benchmark a bit unfair. Same holds if these arguments are multipliers/weights e.g. `return weight0*a+weight1*b`. – Lanting Jun 18 '14 at 13:05
  • The implementation was made to test only the general question "do declared but unspecified default arguments make a function call slower?". @Lanting, you're right, in a real implementation the other arguments would be checked in the function, but I figured adding that here would only detract from that question. – Ludwik Jun 18 '14 at 13:21
  • Edited to add a new benchmark, taking into account all the comments. Thank you all for making me better at benchmarking. – Ludwik Jun 18 '14 at 13:23
  • 2
    Maybe you have something running, that slowed down execution of the 2nd program. – BЈовић Jun 18 '14 at 13:27
  • @BЈовић It's certainly possible. The tests were run straight after each other, and what was running during the second was running during the first too (what was running was: browser, codeBlocks, sublime text, file explorer), but sure, it's possible the browser started loading a file or something else happened in the background. However, I'm not so serious about this as to turn off everything else and all background tasks. Do you think the result is faulty, and the programs should execute at equal speeds? – Ludwik Jun 18 '14 at 13:33
  • I am saying the difference is negligible, if any – BЈовић Jun 18 '14 at 21:45
  • 3
    For benchmarking you should pick the smallest time of all the runs (i.e. least intrusion from background processes), thus both of those programs are as efficient, as expected. You should also check the generated asm code to see if they are equivalent. – JarkkoL Jun 19 '14 at 00:37
1

It will depend on your compiler, the optimizations enabled and whether or not the function is inline.

If the function/constructor is inline, the compiler may optimize the unused arguments out. If the function is not inline, the default values still have to be passed on every call (in registers or on the stack, depending on the calling convention), so it will have a performance impact (significant or not).

But remember, premature optimization is the root of all evil. Do not just assume it will be a big deal and write less maintainable code to work around it before you have run a profiler and are sure that it needs optimization.

fbafelipe