Should we use temporary variables for the returned values of functions?

Question

I was wondering: is there a performance difference between these two practices?

Store the return value of a function in a temporary variable, then pass that variable as an argument to another function.
Pass the function call directly as the argument of the other function.

Specification

Assume all classes and functions are written correctly.

Case 1.

ClassA a = function1();
ClassB b = function2(a);
function3(b);

Case 2.

function3(function2(function1()));

I know there isn't a big difference for a single run, but suppose we run this many times in a loop. I created some tests.

Test

#include <iostream>
#include <ctime>
#include <math.h>
using namespace std;

int main()
{
   clock_t start = clock();
   clock_t ends = clock();

   // Case 1.
   start = clock();
   for (int i=0; i<10000000; i++)
   {
      double a = cos(1);
      double b = pow(a, 2);
      sqrt(b);
   }
   ends = clock();
   cout << (double) (ends - start) / CLOCKS_PER_SEC << endl;

   // Case 2.
   start = clock();
   for (int i=0; i<10000000; i++)
      sqrt(pow(cos(1),2));
   ends = clock();
   cout << (double) (ends - start) / CLOCKS_PER_SEC << endl;
   return 0;
}

Results

Case 1 = 6.375
Case 2 = 0.031

Why is the first one so much slower, and if the second one is faster, why don't we always write code that way? Also, does the second practice have a name?
I also wondered what happens if I declare the variables outside the for loop in the first case, but the result was the same. Why?

What exactly is *this* : `double b = pow(b, 2);` supposed to be doing?? — WhozCraig, Dec 19 '12 at 16:54
An optimizing compiler should be able to make the two cases identical.. except where you invoke UB in the first. — Puppy, Dec 19 '12 at 16:55
@Linuxious It won't be 0. `b` is uninitialized at that point. — Joseph Mansfield, Dec 19 '12 at 16:56
Maybe in the second case, the function calls aren't even made, because their return value is absolutely unused. — , Dec 19 '12 at 16:58
@Linuxios: The result is never taken. It's well within the reach of a normal optimizing compiler to remove function calls which are intrinsic like this, so it knows no side effects, but the value is not taken. — Puppy, Dec 19 '12 at 16:59
@Linuxios `sqrt`, `sin` and `cos` are builtins, probably the compiler knows that they don't have any side effect, so `sqrt(pow(cos(1),2));` is equivalent to (for example) `((void)0);` - an expression with no effect, just a value hanging in the air... — , Dec 19 '12 at 16:59
@linuxios sqrt(pow(cos(1),2)); is probably ignored by the compiler as the result is not used. — undu, Dec 19 '12 at 17:00
The proper full optimization of both loops is "do not do the loop". You need a better test. — Yakk - Adam Nevraumont, Dec 19 '12 at 17:00
@Linuxios thanks for the fix. I recompiled the code and the first case is now twice as slow. — totymedli, Dec 19 '12 at 17:05
@interjay he must be. It's the only reasonable explanation for the radical differences, and if final-value volatility is enforced (i.e. it cannot be optimized away), the end numbers are virtually identical. — WhozCraig, Dec 19 '12 at 17:13
OK, I understand the optimization made by the compiler, but then why are the two practices (described in the Specification part) the same? And given the optimization, shouldn't we use the second one? — totymedli, Dec 19 '12 at 17:33
On a related note, if you're worried about copying an object (something bigger than an intrinsic) into a temporary just to pass it into another function, you can bind the return value of a function to a const reference: `const ClassA &a = function1();` where the signature of `function1()` is `ClassA function1();`. With return value optimization, you can eliminate the temporary altogether. If `ClassA` has a move constructor, `ClassA a = function1()` should also eliminate the need for the copy. — Anthony, Dec 20 '12 at 01:21
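
A minimal sketch of what the last comment describes, under stated assumptions: `ClassA` here is a hypothetical stand-in that counts its own copies and moves, and `viaConstReference` and `viaNamedValue` are illustrative names that do not appear in the thread. It is only meant to show that neither way of receiving the return value calls the copy constructor.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for ClassA: counts how often it is copied or moved.
struct ClassA {
    static int copies;
    static int moves;
    std::string payload;

    explicit ClassA(std::string p) : payload(std::move(p)) {}
    ClassA(const ClassA& other) : payload(other.payload) { ++copies; }
    ClassA(ClassA&& other) noexcept : payload(std::move(other.payload)) { ++moves; }
};
int ClassA::copies = 0;
int ClassA::moves = 0;

// Returns by value, matching the signature `ClassA function1();` from the comment.
ClassA function1() { return ClassA("result"); }

// Binds the returned temporary to a const reference; the temporary lives
// as long as the reference does.
std::size_t viaConstReference() {
    const ClassA& a = function1();
    return a.payload.size();
}

// Initializes a named object from the returned value; the object is
// constructed in place or moved into, but not copied.
std::size_t viaNamedValue() {
    ClassA a = function1();
    return a.payload.size();
}
```

Calling both functions and then reading `ClassA::copies` should show that the copy constructor was never used. Whether `ClassA::moves` stays at zero depends on the language standard and compiler settings, so the sketch makes no claim about it.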

WhozCraig · Accepted Answer · 2012-12-20T00:34:45.333

Defeat the throw-it-all-away optimization if you want to measure the actual computation, and your numbers become much more consistent. To ensure the code that computes the value is actually run and not entirely thrown out, I've assigned the results in both tests to a volatile local (which isn't exactly proper usage of volatile, but does a decent job of ensuring that only the value creation is the significant delta).

#include <iostream>
#include <ctime>
#include <cmath>
using namespace std;

int main()
{
    clock_t start;
    volatile double val;

    for (int j=1;j<=10;j++)
    {
        // Case 1.
        start = clock();
        for (int i=0; i<2000000; i++)
        {
            double a = cos(1);
            double b = pow(a, 2);
            val = sqrt(b);
        }
        cout << j << ':' << (double) (clock() - start) / CLOCKS_PER_SEC << endl;

        // Case 2.
        start = clock();
        for (int i=0; i<2000000; i++)
            val = sqrt(pow(cos(1),2));
        cout << j << ':' << (double) (clock() - start) / CLOCKS_PER_SEC << endl << endl;
    }
    return 0;
}

This produces the following release-compiled output on my MacBook Air (which is no speed demon by any stretch):

1:0.001465
1:0.001305

2:0.001292
2:0.001424

3:0.001297
3:0.001351

4:0.001366
4:0.001342

5:0.001196
5:0.001376

6:0.001341
6:0.001303

7:0.001396
7:0.001422

8:0.001429
8:0.001427

9:0.001408
9:0.001398

10:0.001317
10:0.001353

I see what you are saying, but doesn't this reinforce that we should write code as in case 2 because of the compiler optimization? — totymedli, Dec 19 '12 at 17:24
@totymedli Not really. One could easily argue that the clarity of the first brings more to the developer's eyes than the second, and since the optimizer throws out `a` and `b` anyway (in this case), the end result is the same. Whether each is good or bad practice for non-trivial value types is an issue that people with a *much* better understanding of RVO (return value optimization) than I can probably answer better. For this particular case, either identical or near-identical code is likely produced. — WhozCraig, Dec 19 '12 at 17:34
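
One way to check the "identical or near-identical code" point is to write each case as a function that returns its result, so the value is observable and the computation cannot simply be discarded. This is only a sketch: `case1` and `case2` are hypothetical names, not something from the thread.

```cpp
#include <cmath>

// Case 1: each intermediate result gets a named local variable.
double case1(double x) {
    double a = std::cos(x);
    double b = std::pow(a, 2);
    return std::sqrt(b);
}

// Case 2: the same calls, nested directly.
double case2(double x) {
    return std::sqrt(std::pow(std::cos(x), 2));
}
```

Both functions perform the same three calls in the same order, so they return the same value. Compiling this with optimization enabled and comparing the generated assembly for the two functions is a more direct check than timing a loop.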

score 0 · Answer 2 · answered Dec 19 '12 at 17:10

A proper and legal full optimization of both loops above is "do not even do the loop". You could easily be seeing a case where you have confused the compiler by using an uninitialized variable in the first case, or maybe your use of variables confuses it, or maybe your optimization level forces named variables to actually exist.

Now there is a difference between the two in C++11 involving implicit moves of temporaries, but you can fix this with std::move. (I am not sure, but the last use of a local variable that is going out of scope may qualify for an implicit move.) For a double this makes no difference, but for more complex types it can.

The last use of a variable that is going out of scope does not qualify for an implicit move. Implicit moves only occur in places where copy elision could occur -- `return`s of local variables (where the type of the variable matches the type of the return value), `throw`s of variables that go out of scope before the `catch`, copies of temporaries and `catch`s of exceptions (where the type of the exception matches exactly). — Mankarse, Dec 19 '12 at 17:18
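
The difference discussed above can be sketched with a type that counts its own copies and moves. Everything here is an assumption made for illustration: `Tracked` is a hypothetical stand-in for `ClassA`, `function2` is assumed to take its argument by value, and `namedCopy`, `namedMove`, and `nested` are made-up names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for ClassA that counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    std::vector<int> data;

    Tracked() : data(1000, 42) {}
    Tracked(const Tracked& o) : data(o.data) { ++copies; }
    Tracked(Tracked&& o) noexcept : data(std::move(o.data)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

Tracked function1() { return Tracked(); }

// Assumed to take its argument by value, so the argument has to be
// copied, moved, or constructed in place.
std::size_t function2(Tracked t) { return t.data.size(); }

// Case 1 as written in the question: `a` is an lvalue, so it is copied.
std::size_t namedCopy() {
    Tracked a = function1();
    return function2(a);
}

// Case 1 with an explicit std::move: `a` is moved instead of copied.
std::size_t namedMove() {
    Tracked a = function1();
    return function2(std::move(a));
}

// Case 2: the argument is a temporary, so the copy constructor is not used.
std::size_t nested() {
    return function2(function1());
}
```

In this sketch only `namedCopy` increments `Tracked::copies`, which matches Mankarse's point that the last use of a named local does not get an implicit move. If `function2` took a `const Tracked&` instead, none of the three variants would copy.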
