10

I often see functions where other functions are called multiple times instead of storing the result of the function once.

i.e (1):

void ExampleFunction()
{ 
    if (TestFunction() > x || TestFunction() < y || TestFunction() == z)
    {
        a = TestFunction();
        return; 
    }
    b = TestFunction();
}

Instead I would write it that way, (2):

void ExampleFunction()
{
    int test = TestFunction();
    if (test > x || test < y || test == z)
    {
        a = test;
        return;
    }
    b = test;
}

I think version 2 is much better to read and better to debug. But I'm wondering why people do it like in number 1? Is there anything I don't see? Performance Issue? When I look at it, I see in the worst case 4 function calls in number (1) instead of 1 function call in number (2), so performance should be worse in number (1), shouldn't it?

SingerOfTheFall
  • 29,228
  • 8
  • 68
  • 105
DanielG
  • 1,217
  • 1
  • 16
  • 37
  • 10
    Ask the person who wrote the code! – Kerrek SB Sep 19 '12 at 12:10
  • 2
    The compiler will probably optimize the code to be like your alternative two, so the performance is a non-issue. My self, I find the first alternative more readable actually. – Some programmer dude Sep 19 '12 at 12:10
  • 6
    there could be side effects that means the function has to be called everytime. So the answer to your question would be: it depends. – default Sep 19 '12 at 12:12
  • ok, that makes sense. but when there are no side effects, I can do either way, and performace should be the same in the end? – DanielG Sep 19 '12 at 12:17
  • 4
    @JoachimPileborg I don't believe a compiler will do this optimization, unless `TestFunction` unconditionally returns a compile-time constant or similar. As already mentioned, there might be side effects or the returned value could change every time you call it. The optimization would change the program's behavior in these cases. – reima Sep 19 '12 at 12:21
  • @user1682946 I'd say that thinking about it would be a matter of pre-optimization. If you consider that this function call takes too much time, then do it like the second example. If not, pick the one that is most clear to you. I'd as well go with version two, but that might not be everyones taste. – default Sep 19 '12 at 12:25
  • 1
    @Default: In fact, it follows that the two pieces of code in general *do different things*. So the question isn't really legit without additional information. – Kerrek SB Sep 19 '12 at 12:37
  • @JoachimPileborg: The compiler can only optimize the calls away if the function can be inlined (i.e. if it can see into the definition of the function) and can prove that there are no side effects. This is one of the tricky situations. In gcc, `strlen` is tagged with an attribute to inform the compiler of this situation, and the compiler can optimize multiple calls if neither the pointer nor the contents pointed change in a loop. But if you compile the same code in VS then the compiler will not optimize the call and you will end up with multiple expensive operations... – David Rodríguez - dribeas Sep 19 '12 at 12:40
  • @David: that's not exactly "and" -- if the compiler can assume no side-effects thanks to attributes, then it doesn't have to see into the function definition, and the "proof" of no side-effects is quite short. – Steve Jessop Sep 19 '12 at 12:44
  • @SteveJessop: I guess the comment is a bit confusing... yes, that is precisely the case of `strlen` in gcc. The definition is not available, but it is tagged with an attribute (`__pure__`?) that tells the compiler that the result of the function depends only on the pointer passed in. That is, the condition is that either it can *prove* that there are no side effects and the result will be the same in multiple calls, or else the code *tells* it that this is the case. – David Rodríguez - dribeas Sep 19 '12 at 12:45
  • @DavidRodríguez-dribeas: Such optimizations can also be performed by the linker, so the definition does not need to be available during compilation. However, the issue of proving that a function is free of side-effect remains, but I think that optimizers are smart enough to figure that out in simple cases. – Luc Touraille Sep 19 '12 at 12:46
  • @LucTouraille: I am not sure that optimization can be done by the linker. The linker can decide to *inline* the function even if it is not visible to the compiler in that translation unit. I very much doubt that the linker is going to analyze each and every function to determine whether multiple function calls will yield the same result. Of course, if the linker determines that the cost of the function is minimal and it should be inlined, then it usually calls the compiler with that information, and the compiler can then *see* the definition and it could potentially optimize. – David Rodríguez - dribeas Sep 19 '12 at 12:50
  • @David: the definition of "purity" for `strlen` is actually quite interesting. As you said, it returns the same value provided that the contents of the address passed haven't changed. A function could take a `char*` but the result depends only on the address, not the contents. At a stretch, a function could take a `char*` that points to the first byte of a pointer, in which case it would require that *two* indirection levels of "contents" remain constant. So even the meaning of the attribute isn't simple, let alone using it to optimize :-) – Steve Jessop Sep 19 '12 at 12:50
  • @David again: if the linker does inline the call as part of a whole program optimization, you'd kind of hope that it also does a common sub-expression pass some time thereafter, and that's what results in the equivalent of code (2). But you're right, just because the linker inlines doesn't necessarily mean it does all optimizations. – Steve Jessop Sep 19 '12 at 12:52
  • ... but unless the function is trivial, the linker will not call the compiler again, and the compiler will not have a chance to optimize that. Again `strlen` is a good example. As it contains a loop the heuristic in compilers tend not to inline it (I am not sure whether the linker in *whole program optimization* would consider `strlen` as a candidate for inlining, but it might not be the case), which means that multiple calls will not be optimized (Disclaimer: take all *will* and *won't* in the comment as *I believe it will/won't*, I don't know all details there :)) – David Rodríguez - dribeas Sep 19 '12 at 12:53
  • @SteveJessop: Usually the linker does not inline, but rather calls the compiler/optimizer with the information it has. The point I was trying to make is that the decision of reprocessing a function to inline some of the calls is taken without analyzing whether multiple calls will yield the same result (i.e. I would not expect the linker to perform static analysis). As a matter of fact, the multiple calls may actually hint the linker into not inlining (if the function is called only once in the program, it is a great candidate for inlining (no matter how large the function is)... – David Rodríguez - dribeas Sep 19 '12 at 12:57
  • ... but if there are multiple (many?) calls to it, then inlining may make the code size much larger having an overall negative effect on performance, and thus the compiler/linker might prefer to leave the function call. – David Rodríguez - dribeas Sep 19 '12 at 12:58
  • @David: yeah, when I say "the linker does it", that includes "the linker uses the compiler to do it". I think Luc probably meant the same. I agree with you that whole-program optimization isn't quite as simple as the linker doing a few optimizations, but it amounts to "link-time optimization" or "optimization by the linker" all the same, regardless of whether the linker itself actually has any brains :-) And like you say, even if the techniques are there to make it happen, that doesn't necessarily mean the heuristics will do it. – Steve Jessop Sep 19 '12 at 13:00

5 Answers5

3

I'd use (2) if I wanted to emphasize that the same value is used throughout the code, or if I wanted to emphasize that the type of that value is int. Emphasizing things that are true but not obvious can assist readers to understand the code quickly.

I'd use (1) if I didn't want to emphasize either of those things, especially if they weren't true, or if the number of times that TestFunction() is called is important due to side-effects.

Obviously if you emphasize something that's currently true, but then in future TestFunction() changes and it becomes false, then you have a bug. So I'd also want either to have control of TestFunction() myself, or to have some confidence in the author's plans for future compatibility. Often that confidence is easy: if TestFunction() returns the number of CPUs then you're happy to take a snapshot of the value, and you're also reasonably happy to store it in an int regardless of what type it actually returns. You have to have minimal confidence in future compatibility to use a function at all, e.g. be confident that it won't in future return the number of keyboards. But different people sometimes have different ideas what's a "breaking change", especially when the interface isn't documented precisely. So the repeated calls to TestFunction() might sometimes be a kind of defensive programming.

Steve Jessop
  • 273,490
  • 39
  • 460
  • 699
  • 1
    If the "emphasis on `int`" is to be removed, `auto` can be used instead. – Fred Foo Sep 19 '12 at 13:01
  • 1
    @larsmans: yep, or even `auto&` in case `TestFunction` returns a reference. – Steve Jessop Sep 19 '12 at 13:05
  • One possible reason to emphasize it is that when reading a comparison, you need to know whether the types involved are signed, unsigned, or one of each. So that's a reason to emphasize the type rather than leave it to `TestFunction`. Even assuming the reader is in an IDE and can trivially look up the types, they might benefit from knowing that *you* know what type you're expecting. `auto` is very useful, and naming the type is also very useful, each in their place. – Steve Jessop Sep 19 '12 at 13:25
3

When a temporary is used to store the result of a very simple expression like this one, it can be argued that the temporary introduces unecessary noise that should be eliminated.

In his book "Refactoring: Improving the Design of Existing Code", Martin Fowler lists this elimination of temporaries as a possibly beneficial refactoring (Inline temp).

Whether or not this is a good idea depends on many aspects:

  • Does the temporary provides more information than the original expression, for example through a meaningful name?
  • Is performance important? As you noted, the second version without temporary might be more efficient (most compilers should be able to optimize such code so that the function is called only once, assuming it is free of side-effects).
  • Is the temporary modified later in the function? (If not, it should probably be const)
  • etc.

In the end, the choice to introduce or remove such temporary is a decision that should be made on a case by case basis. If it makes the code more readable, leave it. If it is just noise, remove it. In your particular example, I would say that the temporary does not add much, but this is hard to tell without knowing the real names used in your actual code, and you may feel otherwise.

Luc Touraille
  • 79,925
  • 15
  • 92
  • 137
2

The second option is clearly superior.

You want to emphasize and ensure that you have three times the same value in the if-statement.

Performance should not be a bottleneck in this example. In conclusion minimizing the chance for errors plus emphasize same values are much more important then a potential small performance gain.

Ini
  • 548
  • 7
  • 19
  • I would agree here, especially for the example given by the OP. Other answers are certainly informative and applicable in many cases, but in the OP's example it appears quite obvious that a value (returned by the function) meets three conditions. If the function does not return the same value each call, then this code would be EXTREMELY confusing in case (1). +1 for this answer. – Harvey Jul 26 '19 at 16:33
1

I think version 2 is much better to read and better to debug.

Agreed.

so performance should be worse in number (1), shouldn't it?

Not necessarily. If TestFunction is small enough, then the compiler may decide to optimize the multiple calls away. In other cases, whether performance matters depends on how often ExampleFunction is called. If not often, then optimize for maintainability.

Also, TestFunction may have side-effects, but in that case, the code or comments should make that clear in some way.

Fred Foo
  • 355,277
  • 75
  • 744
  • 836
  • ok thanks, that makes sense to me. I thought of functions with no side effects anyway. So I think I will go with number (2). Since it is better to read and debug, and performance may be better (if it can not be inlined). – DanielG Sep 19 '12 at 12:24
  • As far as I know, compilers now include [Link-time optimizations](http://gcc.gnu.org/projects/lto/lto.pdf), meaning that they should be able to optimize such code even if `TestFunction` is defined in a different translation unit than `ExampleFunction`. – Luc Touraille Sep 19 '12 at 12:39
  • @LucTouraille: I thought that was pretty much limited to Intel C. Thanks! – Fred Foo Sep 19 '12 at 13:00
1

The two are not equivalent. Take for example:

int TestFunction()
{
   static int x;
   return x++;
}

In a sane world though, this wouldn't be the case, and I agree that the second version is better. :)

If the function, for some reason, can't be inlined, the second will even be more efficient.

Luchian Grigore
  • 253,575
  • 64
  • 457
  • 625