Why are virtual thunks necessary?

Question

This question is about a possible implementation of virtual function calls (which I believe is the one gcc uses).

Consider the following scenarios:

1. Class F inherits from class D (and maybe others), which inherits from class B (not virtually). D overrides the virtual method f() declared in B; an object of type F is instantiated.
2. Class F inherits from class D (and maybe others), which inherits from class B (virtually). D overrides the virtual method f() declared in B; an object of type F is instantiated.

(the only difference between these two scenarios is the way class B is inherited)
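
For concreteness, the two scenarios might look like the sketch below. The extra base A, the data members, the namespaces, the return values and the call_f_* helpers are assumptions added for illustration; none of them come from the original post. A is there only so that B does not sit at offset 0 inside D in scenario 1, and the data member in scenario 2 is there so that B is not an empty virtual base. With g++ targeting the Itanium C++ ABI I would expect thunks to D::f to show up in the generated assembly, but the exact offsets depend on the layout.

```cpp
namespace scenario1 {   // B is inherited non-virtually
struct A { virtual int g() { return 0; } };   // hypothetical "other" base
struct B { virtual int f() { return 1; } };
struct D : A, B { int f() override { return 2; } };   // B is a secondary base of D
struct F : D { };
}  // namespace scenario1

namespace scenario2 {   // B is inherited virtually
struct B { virtual int f() { return 1; } long b = 0; };
struct D : virtual B { int f() override { return 2; } };
struct F : D { long extra = 0; };
}  // namespace scenario2

// Call f() through a pointer to the B subobject of an F object.
// In both scenarios the call has to end up in D::f with `this`
// pointing at the D subobject, not at the B subobject.
int call_f_scenario1() {
    scenario1::F obj;
    scenario1::B* b = &obj;
    return b->f();
}

int call_f_scenario2() {
    scenario2::F obj;
    scenario2::B* b = &obj;
    return b->f();
}
```

Both helpers return the value from D::f. An optimizing compiler may devirtualize these particular calls, but the vtables and thunks are still emitted.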

In scenario 1, in the vtable of object B, at the location destined for f() there is now a (non-virtual) thunk that says:

if you want to call f(), first adjust the this pointer by a fixed offset

(it is actually D that puts this thunk there)

In scenario 2, in the vtable of object B, at the location destined for f() there is now a (virtual) thunk that says:

if you want to call f(), first adjust the this pointer by the value stored at addr

(D cannot tell B exactly how much the this pointer needs to be adjusted, because D does not know the position of the B subobject in the final memory layout of the F object)
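
The point about D not knowing where B ends up can be checked with a small sketch. The data members and the helper names below are assumptions made for illustration. The code measures the byte distance from the D subobject to the B subobject, once in a complete D object and once inside an F object.

```cpp
#include <cstddef>

struct B { virtual int f() { return 1; } long b = 0; };
struct D : virtual B { int f() override { return 2; } long d = 0; };
struct F : D { long extra = 0; };   // extra member shifts the virtual base

// Byte distance from the D subobject to the B subobject inside `obj`.
template <class Complete>
std::ptrdiff_t d_to_b_offset(Complete& obj) {
    D* d = &obj;   // D subobject (or the object itself)
    B* b = &obj;   // virtual base subobject, located at run time
    return reinterpret_cast<char*>(b) - reinterpret_cast<char*>(d);
}

std::ptrdiff_t offset_in_complete_d() { D obj; return d_to_b_offset(obj); }
std::ptrdiff_t offset_in_f()          { F obj; return d_to_b_offset(obj); }
```

On an Itanium-ABI compiler such as g++ on x86-64 I would expect the two helpers to return different values, because the virtual base is placed after the members of F. That is why code compiled for D alone cannot hard-code the adjustment. With non-virtual inheritance the distance would be the same in every class derived from D.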

These assumptions were made by looking at the output of g++ -fdump-class-hierarchy in combination with g++ -S. Are they correct?

Now my question is: why is a virtual thunk necessary? Why can't F put a non-virtual thunk in B's virtual table (at the location for f())? When an F object is instantiated, the compiler knows that f() was declared in B and overridden in D. It also knows the exact offset between the B-in-F subobject and the D-in-F subobject (not knowing this offset is, I think, the reason for the virtual thunk in the first place).

EDIT (added output of g++ -fdump-class-hierarchy and g++ -S)

Scenario 1:

g++ -fdump-class-hierarchy:

Vtable for F

...

48 (int (*)(...))D::_ZThn8_N1D1fEv (de-mangled: non-virtual thunk to D::f())

g++ -S:

    _ZThn8_N1D1fEv:
    .LFB16:
        .cfi_startproc
        subq $8, %rdi #,
        jmp .LTHUNK0 #
        .cfi_endproc

Scenario 2:

g++ -fdump-class-hierarchy:

Vtable for F

...

64 (int (*)(...))D::_ZTv0_n24_N1D1fEv (de-mangled: virtual thunk to D::f())

g++ -S:

    _ZTv0_n24_N1D1fEv:
    .LFB16:
        .cfi_startproc
        movq (%rdi), %r10 #,
        addq -24(%r10), %rdi #,
        jmp .LTHUNK0 #
        .cfi_endproc

Can you illustrate your question with a concise code example and the output of `g++ -fdump-class-hierarchy` and `g++ -S`, please? It's hard to get what you mean from your prose. Also note that a _vtable_ isn't a c++ standard concept, but a compiler specific implementation detail. — πάντα ῥεῖ, Jun 06 '17 at 18:53
Modern compilers are pretty good at devirtualizing functions these days btw. But sometimes they just cannot if there are external functions beyond their control that depend on virtuality. — Jesper Juhl, Jun 06 '17 at 19:23
@πάνταῥεῖ I would hope we can do **without** the "_vtable is not a C++ programming language specification concept_" boilerplate when questions include three different **implementation specific tags** (g++, vtable and thunk are not C++ concepts). — curiousguy, Jun 10 '17 at 21:31
@JesperJuhl The question implicitly assumes that the real type is kept away from the compiler, as with `template <class T> T *volatile_bleach(T *volatile p) { return p; }` — curiousguy, Jun 10 '17 at 22:07

score 5 · Answer 1 · edited Aug 25 '17 at 12:16

I think I found the answer here:

"...There are several possible implementations of the thunks given the above information. Note in the following that we assume that prior to calling any vtable entry, the this pointer has been adjusted to point to the subobject corresponding to the vtable from which the vptr is fetched.

A. Since the offsets are always known at compile time, even for virtual bases, each thunk could be distinct, adding the known offset to this and branching to the target function. This would result in a thunk for each overrider at a distinct offset. As a result, a branch mispredict and possibly an instruction cache miss would occur each time the actual type changed for a reference at any given point in the code.

B. In the case of virtual inheritance, the offset, although known when the overrider is declared, may differ depending on derivations from the overrider's class. H and I above are the simplest example. H is a primary base for I, but the int member of I means that A is at a different offset from H in I than it was from a standalone H. Because of this, the ABI specifies that the secondary vtable for a virtual base A contain a vcall offset to H, so that a shared thunk can load the vcall offset, adding it to this, and branch to the target function H::f. This would result in fewer thunks, since for a inheritance hierarchy where A is a virtual base of H, and H::f overrides A::f, all instances of H in a larger hierarchy can use the same thunk. As a result, these thunks will cause fewer branch mispredictions and instruction cache misses. The tradeoff is that they must do a load before the offset add. Since the offset is smaller than the code for a thunk, the load should miss in cache less frequently, so better cache miss behavior should produce better results in spite of the 2 or more cycles required for the vcall offset load...."

It seems that the virtual thunk exists only for performance reasons. If I am wrong, please correct me.
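
To make the difference between options A and B in the quote more tangible, here is a hand-written model of the two thunk styles. It is only a sketch: FakeVTable, FakeBase and the helper functions are hypothetical stand-ins and not the real ABI structures (in the real layout the vcall offset sits at a negative offset from the address the vptr points to, as the `-24(%r10)` in the assembly suggests). The fixed thunk corresponds roughly to `subq $8, %rdi`, the shared thunk to the load followed by `addq`.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-ins, NOT the real Itanium ABI structures.
struct FakeVTable { std::ptrdiff_t vcall_offset; };
struct FakeBase   { const FakeVTable* vptr; };

// Option A: the adjustment is a constant baked into the code,
// so every distinct offset needs its own thunk.
template <std::ptrdiff_t Offset>
void* fixed_thunk(void* self) {
    return static_cast<unsigned char*>(self) + Offset;
}

// Option B: one shared thunk loads the adjustment through the vptr.
void* shared_thunk(void* self) {
    const FakeBase* base = static_cast<const FakeBase*>(self);
    return static_cast<unsigned char*>(self) + base->vptr->vcall_offset;
}

// Put a FakeBase `base_pos` bytes into a buffer and check that the
// shared thunk leads back to the start of the buffer.
bool shared_thunk_finds_start(std::ptrdiff_t base_pos) {
    alignas(std::max_align_t) unsigned char storage[64] = {};
    const FakeVTable vtable{-base_pos};   // per-layout data, not per-layout code
    FakeBase* base = new (storage + base_pos) FakeBase{&vtable};
    return shared_thunk(base) == static_cast<void*>(storage);
}

// Same check for option A; the position must be a compile-time constant.
template <std::ptrdiff_t BasePos>
bool fixed_thunk_finds_start() {
    alignas(std::max_align_t) unsigned char storage[64] = {};
    const FakeVTable vtable{0};           // not consulted by the fixed thunk
    FakeBase* base = new (storage + BasePos) FakeBase{&vtable};
    return fixed_thunk<-BasePos>(base) == static_cast<void*>(storage);
}
```

In this model `shared_thunk_finds_start(16)` and `shared_thunk_finds_start(24)` run the same code with different data, whereas `fixed_thunk_finds_start<16>()` and `fixed_thunk_finds_start<24>()` each instantiate a separate thunk. Both approaches find the right address, which is consistent with the reading that the choice is about code size and cache behaviour rather than correctness.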
