
I have been wondering whether there is a performance difference between these two practices:

  1. Store the return value of a function in a temporary variable, then pass that variable as an argument to another function.
  2. Nest the function call directly inside the call to the other function.

Specification

Assuming all classes and functions are written correctly.

Case 1.

ClassA a = function1();
ClassB b = function2(a);
function3(b);

Case 2.

function3(function2(function1()));

I know there isn't a big difference for a single run, but supposing we run this many times in a loop, I created some tests.

Test

#include <iostream>
#include <ctime>
#include <math.h>
using namespace std;

int main()
{
   clock_t start = clock();
   clock_t ends = clock();

   // Case 1.
   start = clock();
   for (int i=0; i<10000000; i++)
   {
      double a = cos(1);
      double b = pow(a, 2);
      sqrt(b);
   }
   ends = clock();
   cout << (double) (ends - start) / CLOCKS_PER_SEC << endl;

   // Case 2.
   start = clock();
   for (int i=0; i<10000000; i++)
      sqrt(pow(cos(1),2));
   ends = clock();
   cout << (double) (ends - start) / CLOCKS_PER_SEC << endl;
   return 0;
}

Results

  • Case 1 = 6.375
  • Case 2 = 0.031

Why is the first one so much slower, and if the second one is faster, why don't we always write code that way? Also, does the second practice have a name?
I also wondered what happens if I create the variables outside the for loop in the first case, but the result was the same. Why?

totymedli

2 Answers


If you want to measure the actual computation, you have to defeat the optimization that throws all of this work away; once you do, your numbers become much more consistent. To ensure the code that produces the value is actually run and not discarded entirely, I've assigned the results in both tests to a volatile local (which isn't exactly proper usage of volatile, but does a decent job of ensuring that only the value creation is the significant delta).

#include <iostream>
#include <ctime>
#include <cmath>
using namespace std;

int main()
{
    clock_t start;
    volatile double val;

    for (int j=1;j<=10;j++)
    {
        // Case 1.
        start = clock();
        for (int i=0; i<2000000; i++)
        {
            double a = cos(1);
            double b = pow(a, 2);
            val = sqrt(b);
        }
        cout << j << ':' << (double) (clock() - start) / CLOCKS_PER_SEC << endl;

        // Case 2.
        start = clock();
        for (int i=0; i<2000000; i++)
            val = sqrt(pow(cos(1),2));
        cout << j << ':' << (double) (clock() - start) / CLOCKS_PER_SEC << endl << endl;
    }
    return 0;
}

This produces the following release-compiled output on my MacBook Air (which is no speed demon by any stretch):

1:0.001465
1:0.001305

2:0.001292
2:0.001424

3:0.001297
3:0.001351

4:0.001366
4:0.001342

5:0.001196
5:0.001376

6:0.001341
6:0.001303

7:0.001396
7:0.001422

8:0.001429
8:0.001427

9:0.001408
9:0.001398

10:0.001317
10:0.001353
WhozCraig
  • I see what you are saying, but doesn't this reinforce that we should write code like in case 2 because of the compiler optimization? – totymedli Dec 19 '12 at 17:24
  • @totymedli Not really. One could easily argue that the clarity of the first brings more to the developer's eyes than the second, and since the optimizer throws out `a` and `b` anyway (in this case), the end result is the same. Whether each is good or bad practice for non-trivial value types is an issue that people with a *much* better understanding of RVO (return value optimization) than I can probably better answer. For this particular case, either identical or near-identical code is likely produced. – WhozCraig Dec 19 '12 at 17:34

A proper and legal full optimization of both loops above is "do not even do the loop". You could easily be seeing a case where you have confused the compiler by using an uninitialized variable in the first case, or maybe your use of variables confuses it, or maybe your optimization level forces named variables to actually exist.

Now there is a difference between the two in C++11 involving implicit moves of temporary variables, but you can fix this with use of std::move. (I am not sure, but the last use of a local variable that is going out of scope may qualify for implicit move). For a double this is not a difference, but for more complex types this can be.

Yakk - Adam Nevraumont
  • The last use of a variable that is going out of scope does not qualify for an implicit move. Implicit moves only occur in places where copy elision could occur -- `return`s of local variables (where the type of the variable matches the type of the return value), `throw`s of variables that go out of scope before the `catch`, copies of temporaries and `catch`s of exceptions (where the type of the exception matches exactly). – Mankarse Dec 19 '12 at 17:18