
Imagine a project in which there is an interface class like the following:

struct Interface
{
    virtual void f()=0;
    virtual void g()=0;
    virtual void h()=0;
};

Suppose that somewhere else, someone wishes to create a class implementing this interface, for which f, g, h all do the same thing.

struct S : Interface
{
    virtual void f() {}
    virtual void g() {f();}
    virtual void h() {f();}
};

Then it would be a valid optimisation to generate a vtable for S whose entries are all pointers to S::f, thus saving a call to the wrapping functions g and h.

Printing the contents of the vtable, however, shows that this optimisation is not performed:

S s;
void **vtable = *(void***)(&s);  /* I'm sorry. */
for (int i = 0; i < 3; i++)
    std::cout << vtable[i] << '\n';

0x400940
0x400950
0x400970

Compiling with -O3 or -Os makes no difference, nor does switching between clang and gcc.

Why is this optimisation opportunity missed?

At the moment, these are the guesses that I have considered (and rejected):

  1. The vtable printing code actually prints garbage.
  2. The performance improvement is considered worthless.
  3. The ABI prohibits it.
PBS
  • This optimization would be visible to conforming code, since the pointers to member functions would compare equal. – David Schwartz Sep 01 '15 at 18:45
  • It may seem naive, but for me, *function calling `f`* is not `f` itself. So it would be strange, if pointers to member functions would be equal, don't you think? – Mateusz Grzejek Sep 01 '15 at 18:47
  • @David Schwartz: LLVM is known to be able to do things like inlining code through function pointers. Presumably it can also inline the body of `f` into `g` and `h`? That's not the same as the optimization proposed by the OP, I guess... My guess is that this optimization just isn't very important in practice, so nobody cares. It feels like the "pointer-to-member-function comparing equal" issue could either be detected by the compiler, or the standard could be modified to allow the optimization anyway, the same way that copy elision is permitted. – Chris Beck Sep 01 '15 at 18:52
  • @MateuszGrzejek: The intention behind the code is *exactly* `f` is `g` is `h`, in which case having equal function pointers seems completely natural. But C++ does not have a way to express this, forcing the workaround. – PBS Sep 01 '15 at 19:07
  • I guess you mean that `&S::f` must be a different value than `&S::g`, right @DavidSchwartz? However, taking the address of a function may give a different address than the one used when calling the function, which is the one stored in the vtable, and that would allow this optimization. A possible way to get a unique address would be to add a preceding `NOP` opcode to the actual function code. However: If you write code where this optimization is worthwhile, you may have different performance issues, which is probably the reason you don't see this optimization. – Ulrich Eckhardt Sep 01 '15 at 19:14
  • @PBS No, that's not the case. `Interface` specifies that there should be at least three functions, `f`, `g` and `h`, to fully implement `Interface`. For some reason, *they were designed* to be three different functions with three different names. Now you're making a child class in which all of them perform the very same action. Do you think that is good design? The answer is: nope, unless a *very* unusual situation arises. Also, I think such an "optimization" would give no real improvement whatsoever, especially due to inlining. – Mateusz Grzejek Sep 01 '15 at 19:16
  • @MateuszGrzejek: How is it bad design? An example: 'GameObject' with member functions `attack`, `collide`, `talk_to`. Then in 'Dynamite', the vtable entries should all point to `detonate`. – PBS Sep 01 '15 at 19:30
  • @PBS Ah, yes, another `GameObject` hierarchy. OK, let's not go off-topic here. IMHO such an optimization is not necessary. That's all from me. – Mateusz Grzejek Sep 01 '15 at 19:52
  • Having different addresses actually comes in handy when you have to look up symbols from debug info. At the same time, having a little bit of duplicated code isn't necessarily a problem (maybe code cache misses in pathological cases...). BTW, have you checked the actual code found at the different addresses? I haven't checked anything, but it may be that those addresses contain only simple stub code with a jmp to the actual code that is shared between multiple virtual methods... It is worth checking. – pasztorpisti Sep 01 '15 at 22:36
  • @DavidSchwartz: No, it won't, at least not for the compiler considered here. GCC stores the vtable offset in the PMF. So `&Interface::f` is represented as `{0,1}` and `&Interface::g` as `{1,1}` even if the vtable itself holds duplicate entries. (The second value tells the runtime that the first field is a vtable entry, and that the vtable is found at offset 0) – MSalters Sep 01 '15 at 23:14

1 Answer


Such an optimization is not valid, because a class derived from S may override f():

// somewhere-in-another-galaxy.hpp
struct X : S {
    virtual void f();
};

// somewhere-in-another-galaxy.cpp
#include <iostream>
void X::f() {
    std::cout << "Hi from a galaxy far, far away! ";
}

If a compiler implemented your optimization, the following code would not work:

Interface* object = new X;
object->g();

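The compiler of my translation unit knows nothing about the internal implementation of your class, so for g() and h() it simply fills my class's virtual function table (VFT) with references to the corresponding entries in your class's VFT. If those entries pointed directly at S::f, object->g() would jump straight to S::f and never reach X::f.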

Alan Milton
  • The problem is not so much "your compiler does not know about my class-internal implementation" - even if `g()` is implemented in-header, it's a case of "other compilers aren't required to continue to make my compiler's optimisations valid" (which it could do by overriding `g` whenever it overrides `f`). – PBS Sep 26 '15 at 22:17
  • Another note: when wrapping the definition of `S` in an anonymous namespace and turning on `-O2`, gcc *does* merge all the vtable entries (thus lending credence to this answer). – PBS Sep 26 '15 at 22:20