
Consider the following code, which uses the template method design pattern:

#include <cstdlib>
#include <iostream>

// Placeholder so the example links: any condition only known at run time will do.
bool runtime_condition() { return std::rand() % 2 == 0; }

class A {
    public:
        void templateMethod() {
            doSomething();
        }
    private:
        virtual void doSomething() {
            std::cout << "42\n";
        }
};
class B : public A {
    private:
        void doSomething() override {
            std::cout << "43\n";
        }
};

int main() {
    // case 1
    A a; // value semantics
    a.templateMethod(); // knows at compile time that A::doSomething() must be called

    // case 2
    B b; // value semantics
    b.templateMethod(); // knows at compile time that B::doSomething() must be called

    // case 3
    A& a_or_b_ref = runtime_condition() ? a : b;  // ref semantics
    a_or_b_ref.templateMethod(); // does not know which doSomething() at compile time, a virtual call is needed
    return 0;
}

I am wondering whether the compiler is able to inline/devirtualize the doSomething() member function in cases 1 and 2. This would be possible if it created three different pieces of binary code for templateMethod(): one with nothing inlined (for case 3), and two with either A::doSomething() or B::doSomething() inlined (for cases 1 and 2 respectively).

Do you know whether this optimization is required by the standard, or otherwise whether any compiler implements it? I know that I can achieve the same kind of effect with the CRTP and no virtual functions, but the intent would be less clear.
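
For reference, here is a minimal sketch of the CRTP alternative mentioned above. The names Base, ImplA, ImplB and run_crtp_demo are hypothetical and chosen only for illustration, and the output stream is passed in as a parameter so the result is easy to check. Because the derived type is a template argument, the call is resolved statically and no virtual dispatch is involved.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical CRTP counterpart of A/B: the base reaches the derived
// implementation through a static_cast, so the call is bound at compile time.
template <typename Derived>
class Base {
    public:
        void templateMethod(std::ostream& out) {
            static_cast<Derived&>(*this).doSomething(out);
        }
};

class ImplA : public Base<ImplA> {
    friend class Base<ImplA>; // lets the base call the private hook
    void doSomething(std::ostream& out) { out << "42\n"; }
};

class ImplB : public Base<ImplB> {
    friend class Base<ImplB>;
    void doSomething(std::ostream& out) { out << "43\n"; }
};

// Runs both variants and returns everything they printed.
std::string run_crtp_demo() {
    std::ostringstream out;
    ImplA a;
    ImplB b;
    a.templateMethod(out);
    b.templateMethod(out);
    return out.str();
}
```

The trade-off is that ImplA and ImplB no longer share a common base type, so something like case 3 (choosing between them at run time through one reference) is not expressible without extra machinery.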

Bérenger
  • It seems to me that even with aggressive optimization most compilers will fail to inline the static versions, because there can be many examples of functions with the same signature that can't be inlined. For example, if you had some external memory access in your functions: 'cout << *p', where 'p' is a member of the class. The signature of doSomething() is the same as in your example, but inlining can't be done. But it's just an opinion. – Arsenii Fomin Dec 18 '14 at 05:20
  • Well maybe I am wrong but I always thought that any function could technically be inlined without condition provided it is neither virtual nor recursive. In your example, I don't see why the compiler would not be able to inline if the function is non-virtual. – Bérenger Dec 18 '14 at 05:27
  • Hm, I always imagined that inlining is some sort of pasting the code directly, so technically you have code with a 'cout << *(this->p)' statement. You need information about the 'this' pointer in that code, but with inlining you will lose it. Am I wrong? – Arsenii Fomin Dec 18 '14 at 05:37
  • case 3 can also be done at compile time as `b_ref` is a simple alias of `b`. Something like `A& b_ref = runtime_condition() ? a : static_cast<A&>(b);` would require virtual call. – Jarod42 Dec 18 '14 at 08:13
  • @FominArseniy For the compiler, "this" is just an argument. So except for virtual ones, member functions are technically not different from regular functions. – Bérenger Dec 18 '14 at 15:53
  • @Jarod42 Yes, this is the kind of example I am referring to. I will edit my post based on your suggestion, but I think the static_cast is not necessary, right? – Bérenger Dec 18 '14 at 15:55
  • @Bérenger: true, static_cast not needed. – Jarod42 Dec 18 '14 at 16:40
  • Virtual function calls are not that expensive anyway. – Alan Stokes Dec 18 '14 at 20:52

2 Answers


The standard does not require optimisations in general (occasionally it goes out of its way to allow them); it specifies the outcome and it is up to the compiler to figure out how best to achieve it.

In all three cases I would expect templateMethod to be inlined. The compiler is then free to perform further optimisations; in the first two cases it knows the dynamic type of this and so can generate a non-virtual call for doSomething. (I'd then expect it to inline those calls.)

Have a look at the generated code and see for yourself.
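
As a rough illustration of the reasoning above (a hand-written sketch, not actual compiler output), the following uses simplified stand-ins for A and B. The hook is made public and takes a stream parameter purely so that the "after inlining" form can be written out by hand and checked; the function names are hypothetical.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Simplified stand-ins for the question's classes (assumption: public hook,
// stream passed as a parameter).
struct A {
    virtual ~A() = default;
    void templateMethod(std::ostream& out) { doSomething(out); }
    virtual void doSomething(std::ostream& out) { out << "42\n"; }
};
struct B : A {
    void doSomething(std::ostream& out) override { out << "43\n"; }
};

// Cases 1 and 2 as written: calls on local objects whose type is known.
std::string as_written() {
    std::ostringstream out;
    A a;
    B b;
    a.templateMethod(out);
    b.templateMethod(out);
    return out.str();
}

// What the optimiser is allowed to reduce them to once templateMethod is
// inlined: qualified calls bypass virtual dispatch and can be inlined in turn.
std::string after_inlining() {
    std::ostringstream out;
    A a;
    B b;
    a.A::doSomething(out);
    b.B::doSomething(out);
    return out.str();
}

// Case 3: the dynamic type depends on a run-time value, so after inlining
// templateMethod the call to doSomething stays virtual.
std::string through_reference(bool pick_a) {
    std::ostringstream out;
    A a;
    B b;
    A& ref = pick_a ? a : b;
    ref.templateMethod(out);
    return out.str();
}
```

Both as_written() and after_inlining() produce the same output, which is all the standard requires; whether a given compiler actually performs the transformation can only be confirmed by inspecting its generated code.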

Alan Stokes

The optimisation is the compiler's concern, not the standard's. It would be a major bug if an optimisation led to a violation of the principles of virtual functions.

So in the 3rd case:

// case 3
A& b_ref = b; // ref semantics   
b_ref.templateMethod();

the actual object is a B, and the function actually called must be the one defined in class B, whatever reference or pointer is used.

And my compiler correctly displays 43. Had it displayed anything else, I would have changed compilers immediately...

Serge Ballesta
  • My point is that in cases 1 and 2, the compiler knows the object's type at compile time and therefore does not need to generate object code for a virtual call. Of course, when it doesn't know, as in case 3, it should respect the virtual call mechanism. However, in order to handle the three cases with both the right behaviour and complete optimisation, it needs to generate 3 different versions of the templateMethod() object code. – Bérenger Dec 18 '14 at 16:02
  • @Bérenger It doesn't need to generate 3 versions of `templateMethod`. It inlines it, i.e. replaces the call with the function body, in three different places, and then optimises the resulting code as part of compiling `main`. – Alan Stokes Dec 18 '14 at 16:30
  • @Bérenger: I admit that I consider it the problem of compiler developers, not mine. I tried to analyse optimised code and soon gave up: it was quicker than the unoptimised code, but I could hardly find what I had written in the source. Now I only try to do low-level optimisation if I have a performance problem, and only after identifying the bottleneck. – Serge Ballesta Dec 18 '14 at 16:31
  • @SergeBallesta I agree, but it is always good to know when the compiler is smart enough or not, so you don't try to optimize later on something which was optimized behind the scenes from the beginning. – Bérenger Dec 18 '14 at 19:01
  • @AlanStokes I am concerned with the inlining of doSomething(), not templateMethod(). Suppose templateMethod() is recursive depending on a run-time condition: it won't be inlined. But will doSomething() be inlined? If yes, it implies 3 different pieces of object code for templateMethod(). – Bérenger Dec 18 '14 at 19:04
  • No; it implies that only in the sense that when a function is inlined the code generated for the function is customised for that call site. Read my answer and my comment above. (Compilers will happily inline recursive functions, btw - just not to infinite depth.) You clearly find your existing answers unsatisfactory; perhaps you should clarify what you are actually asking. – Alan Stokes Dec 18 '14 at 20:45
  • @AlanStokes "the code generated for the function is customised for that call site". Yes, but suppose that for whatever reason, templateMethod() is not inlined. Then if the compiler is naive it will generate code for this method at ONE site. Then it will fail to inline doSomething() at this site since it really needs to inline 2 different codes: A::doSomething() and B::doSomething(). – Bérenger Dec 19 '14 at 04:29
  • @AlanStokes Btw your answer is interesting, but not completely addressing the problem, hence my comments. – Bérenger Dec 19 '14 at 04:32