1

I don't like to repeat myself in code, but I also don't want to lose performance on simple functions. Suppose a class has operator+ and a function Add with the same functionality (the former being a handy way to use the class in expressions, the latter an "explicit" way to do the same):

struct Obj {
   Obj operator+(float);
   Obj Add(float);
   /* some other state and behaviour */
};

Obj AddDetails(Obj const& a, float b) {
   return Obj(a.float_val + b, a.some_other_stuff);
}

Obj Obj::operator+(float b) {
   return AddDetails(*this, b);
}

Obj Obj::Add(float b) {
   return AddDetails(*this, b);
}

To make changes easier, both functions are implemented by calling an auxiliary function. As a result, any call to the operator makes two calls, which is not really pleasant.

But is the compiler smart enough to eliminate such double calls?

I tested with simple classes (containing built-in types and pointers), and the optimizer simply doesn't compute anything that isn't needed. But how does it behave in large systems, especially with hot calls?

If this is where RVO takes place, does it also work for longer chains of calls (3-4), folding them into a single call?

P.S. Yes, yes, premature optimization is the root of all evil, but I still want an answer.

Alexey S. Larionov
  • 2
    Obviously depends on the compiler, I would imagine yes, but you'd have to test it with your compiler. – john Feb 27 '19 at 16:42
  • 3
    You could check on https://godbolt.org/ – drescherjm Feb 27 '19 at 16:42
  • 3
    Build with optimization enabled, and look at the generated assembly code. It could be that the compiler inlines your calls. – Some programmer dude Feb 27 '19 at 16:43
  • If code is split in several TUs, it would be harder and require LTO. – Jarod42 Feb 27 '19 at 16:46
  • Defining your functions within your class, or in the header with the `inline` keyword, will improve the chances that this will be optimized. – 1201ProgramAlarm Feb 27 '19 at 16:49
  • @1201ProgramAlarm inlining is kind of unreliable. From my point of view, the compiler can see the opportunity for some kind of RVO, but does it really take it? – Alexey S. Larionov Feb 27 '19 at 16:52
  • Any compiler worth its price will inline those calls. – SergeyA Feb 27 '19 at 16:53
  • @AlexLarionov If every Translation Unit has the definition of those functions they can be easily optimized. If they are only defined in a .CPP file you have to rely on Link Time Optimization which is a more involved process. – 1201ProgramAlarm Feb 27 '19 at 16:54
  • "Is optimization applied to single-line functions?" - Just as much as they are applied to multi-line functions. The number of *C++ source code* lines has no bearing on what optimizations the compiler will use or not use (well, that's a simplification, but *in general* true). And if you add LTO into the mix, then certainly (one line) functions can be optimized even more. – Jesper Juhl Feb 27 '19 at 16:59
  • @Jarod42 I don't think copy elision requires LTO. Larger returned objects get allocated on the caller's stack (depends on the calling conventions, but that's what happens on 64-bit Windows) and the callee gets the address as an argument. The job of copy elision is to use this address for all the returned objects on the call chain, which is unrelated to inlining, LTO and stuff like that. – Alexey B. Feb 27 '19 at 17:03
  • Why would the compiler treat 1 line functions special? They are just functions. And the single line (I'm guessing "statement") they contain can involve some *very* complex code that requires *way more* code-gen and optimization than a simple multi line (statement) function. The compiler has *no reason* to specialize the one-line case. – Jesper Juhl Feb 27 '19 at 17:05
  • @SergeyA Not if `AddDetails` isn't at least declared with static scope. If exported from TU, cost model for code size can already block inlining based on overhead for parameter expansion. Constructor, operators and `AddDetails` declared in same TU, with `Obj` having default move constructor would be required for almost guaranteed inlining. – Ext3h Feb 27 '19 at 17:17
  • @AlexeyB.: Indeed copy elision is special, but I read the question more generally (as `assign`/`operator=` or comparison operators where (N)RVO doesn't apply). – Jarod42 Feb 27 '19 at 17:46

2 Answers

2

Overall

Yes. See the instructions clang generated at https://godbolt.org/z/VB23-W, line 21:

   movsd   xmm0, qword ptr [rsp]   # xmm0 = mem[0],zero
   addsd   xmm0, qword ptr [rip + .LCPI3_0]

It just applies the code of AddDetails directly instead of even calling your operator+. This is called inlining, and it worked even for this chain of value-returning calls.
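To make the "longer chain" case from the question concrete, here is a minimal sketch of three one-line forwarding layers over a single helper. The names Vec1, add_impl and add_l1 to add_l3 are hypothetical and not part of the question's code. Whether the whole chain is flattened depends on the compiler and flags, so treat it as something to compile with optimisation enabled (for example -O2) and inspect, not as a guarantee.

```cpp
// Hypothetical type used only for this illustration.
struct Vec1 {
    double value;
};

// Innermost helper that does the actual work.
inline Vec1 add_impl(Vec1 const& a, float b) {
    return Vec1{a.value + b};
}

// Three forwarding layers, each a one-line wrapper around the next.
// Every definition is visible in this translation unit, so the
// optimiser is free to inline the chain down to a single addition.
inline Vec1 add_l1(Vec1 const& a, float b) { return add_impl(a, b); }
inline Vec1 add_l2(Vec1 const& a, float b) { return add_l1(a, b); }
inline Vec1 add_l3(Vec1 const& a, float b) { return add_l2(a, b); }
```

Calling add_l3(v, x) must give the same result as add_impl(v, x); the only open question is how many call instructions remain in the generated code, which the assembly view on godbolt.org answers for a given compiler.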

Details

Single-line functions are subject not only to RVO but to every other optimisation as well, including inlining; see https://godbolt.org/z/miX3u1 and https://godbolt.org/z/tNaSW.

Here you can see that gcc and clang heavily optimise even code that is not declared inline (https://godbolt.org/z/8Wf3oR):

#include <iostream>

struct Obj {
    Obj(double val) : float_val(val) {}
    Obj operator+(float b) {
        return AddDetails(*this, b);
    }
    Obj Add(float b) {
        return AddDetails(*this, b);
    }
    double val() const {
        return float_val;
    }
private:
    double float_val{0};
    static inline Obj AddDetails(Obj const& a, float b);
};

Obj Obj::AddDetails(Obj const& a, float b) {
    return Obj(a.float_val + b);
}


int main() {
    Obj foo{32};
    Obj bar{foo + 1337};
    std::cout << bar.val() << "\n";
}

Even without inlining, no extra constructor calls can be seen with:

#include <iostream>

struct Obj {
    Obj(double val) : float_val(val) {}
    Obj operator+(float);
    Obj Add(float);
    double val() const {
        return float_val;
    }
private:
    double float_val{0};
    static Obj AddDetails(Obj const& a, float b);
};

Obj Obj::AddDetails(Obj const& a, float b) {
    return Obj(a.float_val + b);
}

Obj Obj::operator+(float b) {
    return AddDetails(*this, b);
}

Obj Obj::Add(float b) {
    return AddDetails(*this, b);
}

int main() {
    Obj foo{32};
    Obj bar{foo + 1337};
    std::cout << bar.val() << "\n";
}

However, some of the optimisation happens because the compiler knows the value won't change, so let's change main to:

int main() {
    double d{};
    std::cin >> d;
    Obj foo{d};
    Obj bar{foo + 1337};
    std::cout << bar.val() << "\n";
}

Even then, you can still see the optimisations with both compilers: https://godbolt.org/z/M2jaSH and https://godbolt.org/z/OyQfJI

Superlokkus
0

From what I understand, since C++17 compilers are required to apply copy elision in your case. According to https://en.cppreference.com/w/cpp/language/copy_elision, when you write return Obj(a.float_val + b, a.some_other_stuff), the constructor call is a prvalue of the return type; returning it does not create a temporary object, so no move or copy happens. The same applies to return AddDetails(*this, b), because a call to a function returning by value is also a prvalue.
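One way to check this without reading assembly is to count copy and move constructions through a chain of by-value returns. The sketch below uses a hypothetical instrumented type, Tracked, and hypothetical functions make_inner, make_middle and make_outer; none of them come from the question. Under C++17 the counters are guaranteed to stay at zero for this pattern. Earlier standards only permit the elision, although mainstream compilers perform it by default.

```cpp
// Hypothetical instrumented type: counts copy and move constructions.
struct Tracked {
    static int copies;
    static int moves;
    double value;

    explicit Tracked(double v) : value(v) {}
    Tracked(Tracked const& other) : value(other.value) { ++copies; }
    Tracked(Tracked&& other) noexcept : value(other.value) { ++moves; }
};

int Tracked::copies = 0;
int Tracked::moves = 0;

// Each function returns a prvalue, so the result is constructed
// directly in the storage of the final object in the caller.
Tracked make_inner(Tracked const& a, float b) { return Tracked(a.value + b); }
Tracked make_middle(Tracked const& a, float b) { return make_inner(a, b); }
Tracked make_outer(Tracked const& a, float b) { return make_middle(a, b); }
```

After Tracked result = make_outer(start, 1.0f); the values of Tracked::copies and Tracked::moves show how many copies or moves the chain actually performed. Note that this measures copy elision only; it says nothing about whether the calls themselves were inlined.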

Alexey B.