
I am working on an embedded platform that doesn't cope well with dynamic dispatch (it has no speculative or out-of-order execution at all). On this platform I call a virtual member function on the same object quite often, but the compiler fails to optimize the vtable lookup away; it doesn't seem to recognize that the lookup is only required for the first invocation.

Therefore I wonder: is there a manual way to devirtualize a virtual member function of a C++ class, in order to get a function pointer that points directly to the resolved address? I had a look at C++ member function pointers, but since they seem to require the class type to be specified, I guess this won't work.

Thank you in advance

user2923748
  • possible duplicate of [Hoisting the dynamic type out of a loop (a.k.a. doing Java the C++ way)](http://stackoverflow.com/questions/7451442/hoisting-the-dynamic-type-out-of-a-loop-a-k-a-doing-java-the-c-way) – Kerrek SB Dec 11 '13 at 13:45
  • Thanks for pointing out this could be a duplicate, however the question in the other thread is not answered :/ – user2923748 Dec 11 '13 at 14:03
  • When you say "it doesn't seem to recognize the lookup is only required for the first invocation" do you mean that, within a single function, the base class pointer is guaranteed to always point to the same instance? (Eg. VBase *b = new Subclass(); b->virtualfunc();...). – Ben Jaguar Marshall Dec 11 '13 at 22:49

3 Answers


There's no general, standard-C++-only way to find the address of a virtual function given only a reference to a base class object. Furthermore, there's no reasonable type for such an address, because the this pointer need not be passed as an ordinary argument following the general calling convention (e.g. it can be passed in a register, with the other arguments on the stack).

If you do not need portability, however, you can always do whatever works for your given compiler. E.g., with Microsoft's COM (I know, that's not your platform) there is a known memory layout with vtable pointers, so as to access the functionality from C.

If you do need portability, then I suggest designing the optimization in. For example, instead of

class Foo_base
{
public:
    virtual void bar() = 0;
};

do something like

class Foo_base
{
public:
    typedef void (*Bar_func)(Foo_base&);

    virtual Bar_func bar_func() const = 0;
    void bar() { bar_func()( *this ); }
};

supporting the same public interface as before, but now exposing the innards, so to speak, thus allowing manual optimization of repeated calls to bar.
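A minimal sketch of how a caller might use this design to hoist the lookup out of a loop. The derived class `Counter`, its static helper `bar_impl`, and the function `call_bar_n_times` are hypothetical names introduced only for illustration, and the virtual destructor is an addition that the original class does not have.

```cpp
#include <cassert>

class Foo_base
{
public:
    typedef void (*Bar_func)(Foo_base&);

    virtual ~Foo_base() {}                   // added for the sketch
    virtual Bar_func bar_func() const = 0;   // one virtual call fetches the target
    void bar() { bar_func()( *this ); }
};

// Hypothetical derived class, not part of the original answer.
class Counter : public Foo_base
{
public:
    int count = 0;

    Bar_func bar_func() const override { return &bar_impl; }

private:
    // Plain (static) function, so its address is an ordinary function pointer.
    static void bar_impl(Foo_base& self)
    {
        // Safe in this sketch because only Counter hands out bar_impl.
        static_cast<Counter&>(self).count += 1;
    }
};

// Resolves the target once, then calls through the plain function pointer.
void call_bar_n_times(Foo_base& obj, int n)
{
    const Foo_base::Bar_func f = obj.bar_func();  // single virtual dispatch
    assert(f != nullptr);
    for (int i = 0; i < n; ++i)
    {
        f(obj);                                   // indirect call, no vtable lookup
    }
}
```

Each call in the loop is still an indirect call through a pointer, but the load of the vtable pointer and the slot lookup happen only once per object rather than once per call.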

Cheers and hth. - Alf

Regarding GCC, I saw the following while debugging the compiled assembly code. A pointer to member function holds two values: a) a "pointer" to the method, and b) an offset to add to the starting address of the class instance (the offset is used when multiple inheritance is involved, for methods of the second and further parent classes, whose data starts at a different address within the object).

The "pointer" to the method is interpreted as follows: 1) If the "pointer" is even, it is a normal (non-virtual) function pointer. 2) If the "pointer" is odd, subtract 1; the remaining value is 0, 4, 8, 12, ... (assuming a pointer size of 4 bytes) and is the byte offset into the vtable from which to fetch the address of the "real" non-virtual method. This encoding obviously assumes that all normal methods start at even addresses (so the compiler must align them accordingly).

So the idea, in order to devirtualize the call, is to convert a virtual method pointer into a non-virtual method pointer and afterwards apply it to the "subject", that is, our class instance.

The code below does what is described.

#include <stdio.h>
#include <string.h>
#include <typeinfo>
#include <typeindex>
#include <cstdint>

struct Animal{
    int weight=0x11111111;
    virtual int mm(){printf("Animal1 mm\n");return 0x77;};
    virtual int nn(){printf("Animal1 nn\n");return 0x99;};
};

struct Tiger:Animal{
    int weight=0x22222222,height=0x33333333; 
    virtual int mm(){printf("Tigerxx\n");return 0xCC;}
    virtual int nn(){printf("Tigerxx\n");return 0x99;};
};

typedef int (Animal::*methodPointerT)();

typedef struct {
    void** functionPtr;
    size_t offset;
} MP;

// Relies on the member-pointer layout described above (GCC); not portable.
void devirtualize(methodPointerT& mp0,const Animal& a){
    MP& t=*(MP*)&mp0;
    if((uintptr_t)t.functionPtr & 1){                            // odd value: virtual method
        size_t index=((uintptr_t)t.functionPtr-1)/sizeof(void*); // byte offset -> vtable slot
        void** vTable=*(void***)&a;                              // vptr sits at the start of the object
        t.functionPtr=(void**)vTable[index];                     // store the resolved address
    }
}

int main()  
{
    int (Animal::*mp1)()=&Animal::nn;

    Animal x;Tiger y;

    (x.*mp1)();(y.*mp1)();   // virtual dispatch: Animal::nn, then Tiger::nn

    devirtualize(mp1,x);     // resolve mp1 against x's vtable

    (x.*mp1)();(y.*mp1)();   // direct calls: Animal::nn both times

}
George Kourtis

Yes, this is possible in a way that works at least with MSVC, GCC and Clang.

I was also looking for how to do this, and here is a blog post I found that explains it in detail: https://medium.com/@calebleak/fast-virtual-functions-hacking-the-vtable-for-fun-and-profit-25c36409c5e0

Taking the code from there, in short, this is what you need to do. This function works for all objects:

template <typename T>
void** GetVTable(T* obj) {
  return *((void***)obj);
}

And then to get a direct function pointer to the first virtual function of the class, you do this:

typedef void(VoidMemberFn)(void*);

VoidMemberFn* fn = (VoidMemberFn*)GetVTable<BaseType>(my_obj_ptr)[0];

// ... sometime later

fn(my_obj_ptr);

So it's quite easy actually.
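For completeness, here is a self-contained sketch of the same idea. It rests on several assumptions that the C++ standard does not guarantee: the vtable pointer is stored at the start of the object, slot 0 holds the first declared virtual function, and the target's calling convention passes this like an ordinary first argument (as on x86-64 or ARM with GCC/Clang). The types `Shape` and `Square` and the function `call_first_virtual` are hypothetical names, and `memcpy` is used here in place of the pointer cast to read the vtable pointer.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical types for illustration.
struct Shape {
    virtual int sides() { return 0; }
};

struct Square : Shape {
    int sides() override { return 4; }
};

// Reads the vtable pointer, ASSUMED to be stored at the start of the object.
template <typename T>
void** GetVTable(T* obj) {
    void** vtable;
    std::memcpy(&vtable, obj, sizeof vtable);
    return vtable;
}

// ASSUMPTION: this is passed like an ordinary first argument on this target.
typedef int (SidesFn)(void*);

int call_first_virtual(Shape* s) {
    // ASSUMPTION: slot 0 of the vtable is the first declared virtual function.
    SidesFn* fn = reinterpret_cast<SidesFn*>(GetVTable(s)[0]);
    assert(fn != nullptr);
    return fn(s);  // direct call through the cached address
}
```

On a target where member functions use a different calling convention than free functions, the call through `fn` would not pass this correctly, so this sketch should be checked against the ABI of the actual platform before use.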

JohnAl
  • It may have worked with those three compilers on one architecture where you tested, but it will surely fail on some other architectures. – Ben Voigt Feb 02 '22 at 19:42
  • To my knowledge, it is a stable convention followed by at least MSVC, GCC, and Clang that the VTable is set up this way. Could you explain why you say a specific architecture could make it fail? – JohnAl Feb 02 '22 at 20:13
  • Did you read the first paragraph of the accepted answer before adding your own? I can vouch for the fact that Alf is correct about a difference in calling convention on some architectures between a non-static member function and an ordinary function. You just can't call a `thiscall` calling convention function through an ordinary function pointer on that very popular architecture, as the "this" pointer won't be placed in the correct register. – Ben Voigt Feb 02 '22 at 20:36