-1

This has come up before, but the questions are slightly different, and the answers were all quite unhelpful, so I'll try one more time.

I need two pieces of information from the compiler which seem to be difficult to extract.

  1. I want to find the vtable pointer for a given class without an instance of the class. The only reference I can find to the vtable symbol anywhere in the binary is in the constructor where it's assigned to new instances, and it's really awkward to get the pointer out of the constructor without calling it... I'm wondering if anyone can point to the vtable name mangling spec for common compilers (msvc, gcc, clang) so I can extern to the symbol explicitly? (I haven't been able to find this). My concern with this is that I suspect (at least on VC) that the symbol name has some characters that are illegal in C++ identifiers, so I'm not sure how to create a variable that links to it...

  2. I need the actual function pointers for methods. The only syntax that seems to be available to approach member function pointers is the pointer-to-member operator, and that results in very compiler-specific output.

    I have observed that GCC/Clang produce a nice little struct: { void *ptr_or_offset; size_t suspected_vtable; }. From that, it's easy enough to find the actual function pointer (assuming I have the vtable pointer! see #1).

    MSVC is a little harder: pointers-to-members for virtuals are pointers to thunk functions that perform the virtual lookup. The thunks seem to correlate with the vtable offset, so there is one thunk for each vtable offset. This strategy makes it very hard to identify whether the method is virtual and, if it is, to get the vtable offset (and therefore the actual function pointer). I'm thinking maybe I can fabricate a table of thunk pointers for each vtable offset up to some N; then, when I take a pointer-to-member, I can compare it with each item in the thunk table. If it is among them, I know it is virtual and I know the vtable offset, so I can get the pointer.
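
As a rough illustration of the GCC/Clang case described above, here is a minimal sketch that decodes a pointer-to-member-function under the Itanium C++ ABI. Assumptions: the names `RawMemFn`, `decode`, `is_virtual`, `vtable_slot`, `resolve` and the `Shape` class are made up for this example; the second field is treated as a `this` adjustment, which is how the Itanium ABI describes it, rather than as a vtable pointer; nothing here applies to MSVC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed layout of a pointer-to-member-function under the Itanium C++ ABI
// (GCC and Clang on Linux/macOS). This is NOT the MSVC layout.
struct RawMemFn {
    std::uintptr_t ptr;  // non-virtual: function address; virtual: encoded vtable byte offset
    std::ptrdiff_t adj;  // adjustment applied to `this` (ARM also keeps the virtual flag here)
};

// Hypothetical example class, used only to exercise the decoder.
struct Shape {
    virtual ~Shape() {}
    virtual int area() const { return 1; }
    virtual int sides() const { return 2; }
    int plain() const { return 3; }
};

// Copy the bits of a pointer-to-member-function into the struct above.
template <typename M>
RawMemFn decode(M pmf) {
    static_assert(sizeof(M) == sizeof(RawMemFn), "unexpected pointer-to-member layout");
    RawMemFn raw;
    std::memcpy(&raw, &pmf, sizeof raw);
    return raw;
}

#if defined(__arm__) || defined(__aarch64__)
// ARM variant of the ABI: the virtual flag is the low bit of adj,
// and ptr holds the plain byte offset into the vtable.
inline bool is_virtual(const RawMemFn& m) { return (m.adj & 1) != 0; }
inline std::size_t vtable_slot(const RawMemFn& m) {
    assert(is_virtual(m));
    return m.ptr / sizeof(void*);
}
#else
// Generic Itanium ABI: the virtual flag is the low bit of ptr,
// so ptr holds 1 + the byte offset into the vtable.
inline bool is_virtual(const RawMemFn& m) { return (m.ptr & 1) != 0; }
inline std::size_t vtable_slot(const RawMemFn& m) {
    assert(is_virtual(m));
    return (m.ptr - 1) / sizeof(void*);
}
#endif

// Resolve the final function address, given the vtable address point of the
// dynamic type (the value stored in the object's vptr).
inline void* resolve(const RawMemFn& m, void* const* vtable) {
    return is_virtual(m) ? vtable[vtable_slot(m)] : reinterpret_cast<void*>(m.ptr);
}
```

The slot index is relative to the vtable address point, i.e. the value an object's vptr holds, so `resolve` is only useful once the vtable pointer from #1 is available.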

So, this all sounds horrible, but it is what it is since C++ doesn't feel like syntax should be available to get these fundamental language primitives, and doesn't have proper delegates for some unknown reason!

Can anyone think of a better or more-direct mechanism to capture those pieces of data that I seek? Or any alternative solutions which would improve on portability would also be cool!

Cheers!

Edit: Considering there are other posts similar to this already filled with people saying that 'it's not portable', and 'don't do it', I'd like to request that you refrain from polluting this thread with more of the same. They are worthless comments that don't address the problem. This problem requires some creative thinking, impress us with the quality of your solution.

Edit 2: Not sure why I'm being down-voted. This is an interesting and largely unsolved problem. There is very little topical discussion on the internets.

Manu Evans
  • 2
    I guess you are into code that is not portable and may break with the new version of a compiler – Ed Heal Mar 20 '16 at 14:09
  • 1
    It does indeed sound horrible. In other words: Do NOT PURSUE this. To call a member function, you NEED an instance. There are `std::function` and `std::bind` that will help in C++11. – Mats Petersson Mar 20 '16 at 14:29
  • 1
    Oh, and if you still insist on doing this, I'd suggest you make sure your personal identity (email address, username, etc) can not lead in any way shape or form to the author identity of such code, since potential employers have been known to look up what people have posted online as part of their references. – Mats Petersson Mar 20 '16 at 14:31
  • I am, and remain extremely employable. Thank you for your concern though. – Manu Evans Mar 20 '16 at 14:51
  • You'll also notice that I don't mention *calling* functions anywhere in this post. It's not part of the problem. std::function/std::bind don't solve either of the problems I posted. – Manu Evans Mar 20 '16 at 14:54
  • At the very least, give some sort of explanation *why* you think you need it. Assuming this isn't just for fun, what's the bigger problem you're trying to solve that made you think this would be a good idea? –  Mar 20 '16 at 14:57
  • I have a scenario where it would be very useful and wickedly efficient to be able to build dynamic types by fabricating vtables to dynamically derive C++ classes at runtime. It's extremely application specific, and portability is not important in my domain. That said, I don't want discussion of relevance to distract from the 2 actual problems. They are interesting problems in their own right, and they're not really solved. I can (and currently do) work around these issues with various shims and glue, but it all comes at the cost of numerous extra indirections and some dynamic allocations. – Manu Evans Mar 20 '16 at 15:05
  • You are diving into *implementation details*. The simple existence of a vTable is already an implementation detail even if all current compilers do use that. If you really need that my best advice would be to dive into source code for CLang which is more recent than gcc and so probably simpler to understand. – Serge Ballesta Mar 20 '16 at 15:06
  • Your comment may say that portability is not important, but your question says you want it to be portable to three different compilers that perform different kinds of optimisations. And dynamic classes is another solution looking for a problem. It's something C++ doesn't have, it's something I'm pretty sure you already know C++ doesn't have. You probably have some reasons for wanting that. You may get useful alternatives if you share those reasons, but right now, you won't get anything you'll consider useful here. –  Mar 20 '16 at 15:11
  • Yes, I'm specifically requesting links to *implementation details*, I'm not sure why this upsets people so much. I haven't been able to find definitive information about vtable symbol names, for instance. I'm also looking for clever tricks that may coerce the compiler to part with some of its internal detail. When the constants I'm chasing appear only in the instruction stream, that's not so helpful. I'd like to coerce the compiler to produce the magic numbers/pointers in data blocks. Clang seems to be fairly simple, I think I can solve for GCC/Clang, VC is the most trouble, as usual. – Manu Evans Mar 20 '16 at 15:13
  • @hvd obviously a more portable solution is preferable; the code volume will likely be shorter, perhaps less `#ifdef`'s, potentially simpler to maintain. I'm not sure why an interesting problem can't invoke some thought and interesting solutions in its own right? – Manu Evans Mar 20 '16 at 15:15
  • 1. You can't. 2. You can't. If you need a function with a given signature, because your third-party APIs require it or something, **just write one**. Anything you find by looking at your compiler-specific output may or may not be callable by other software. – n. m. could be an AI Mar 21 '16 at 11:04
  • I've just noticed you mention "proper delegates". I have no idea what proper delegates are, but you just might have a case of XY-problem. What is the real problem you are trying to solve? – n. m. could be an AI Mar 21 '16 at 12:51

3 Answers

0

Virtual tables are not part of the C++ standard at all. There is not a word about them in the whole ISO standard; they are only a commonly used implementation practice. This is why you'll find no tool support for what you want.

Furthermore, virtual tables can be a very complex matter, for example in the case of multiple inheritance, where several different virtual tables are used in the same object (one for each of the different subobjects).
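
To make the multiple-inheritance point concrete, here is a small sketch. The class names and the helper `saveable_offset` are hypothetical, and the layout it checks is what common compilers produce in practice, not something the standard promises.

```cpp
#include <cassert>
#include <cstddef>

// Two unrelated polymorphic bases; each one brings its own vtable pointer.
struct Drawable {
    virtual ~Drawable() {}
    virtual int draw() const { return 1; }
};
struct Saveable {
    virtual ~Saveable() {}
    virtual int save() const { return 2; }
};

struct Widget : Drawable, Saveable {
    int draw() const override { return 10; }
    int save() const override { return 20; }
};

// Byte distance between the Saveable subobject and the start of the Widget.
inline std::ptrdiff_t saveable_offset(const Widget& w) {
    const char* whole = reinterpret_cast<const char*>(&w);
    const char* part  = reinterpret_cast<const char*>(static_cast<const Saveable*>(&w));
    return part - whole;
}

// Checks that hold on the usual implementations (assumption, not a guarantee).
inline void check_layout() {
    Widget w;
    assert(saveable_offset(w) != 0);              // second base lives at a non-zero offset
    assert(sizeof(Widget) >= 2 * sizeof(void*));  // room for two vtable pointers
}
```

A call through a `Saveable&` goes through the second vtable pointer, so any scheme that assumes "the" vtable of a class has to decide which subobject it is talking about.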

You have found an astute way of getting some info out, and have apparently already studied this in detail. However, I'm not sure that your way of finding the vtable offset will work in all the different conditions. I'd strongly advise you to think your problem over and find another way to implement it (unless you're working on new debugging tools and have no choice ;-) )

Christophe
0

This is just a hint, and I'm unsure whether it is relevant for your actual problem.

If you need access to the underlying vtables (which are only implementation details), you should think about explicit vtable management. That means that none of your dynamic classes should have virtual functions; each should hold only a pointer to an explicit vtable containing pointers to functions (not member functions) whose first parameter is a pointer (or reference) to an object of the class.

You should then explicitly use an invoke function or macro that at some point calls something like: inner_invocation(object, offset_of_method, other_args...). Once you are there, it becomes possible to call the actual virtual member function by hand.

OK, it really looks like C++ written in C, but at least it lets you control the operation without relying too much on compiler implementation details.
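
A minimal sketch of that explicit vtable idea could look like the following. All names (`VTable`, `Object`, `call_area`, `derive_with_area`, and the `rect_*`/`frame_*` functions) are invented for the illustration, and named slots stand in for the offset-based `inner_invocation` mentioned above.

```cpp
#include <cassert>
#include <string>

struct Object;  // forward declaration

// Hand-written vtable: plain function pointers that take the object first.
struct VTable {
    std::string (*name)(const Object&);
    int (*area)(const Object&);
};

// No virtual functions here, only an explicit pointer to the table.
struct Object {
    const VTable* vt;
    int w, h;
};

// "Base class" implementations.
std::string rect_name(const Object&) { return "rect"; }
int rect_area(const Object& o) { return o.w * o.h; }

// Hypothetical override used by the run-time derived type.
int frame_area(const Object& o) { return 2 * (o.w + o.h); }

const VTable rect_vtable = { rect_name, rect_area };

// "Derive" at run time: copy the base table, then replace one slot.
inline VTable derive_with_area(const VTable& base, int (*area)(const Object&)) {
    VTable derived = base;
    derived.area = area;
    return derived;
}

// Explicit invocation helpers, standing in for the invoke function or macro.
inline int call_area(const Object& o) {
    assert(o.vt && o.vt->area);
    return o.vt->area(o);
}
inline std::string call_name(const Object& o) {
    assert(o.vt && o.vt->name);
    return o.vt->name(o);
}
```

Because the table is ordinary data, building a new "class" at run time is just a copy plus a few assignments, with no dependence on the compiler's own vtable layout.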


After reading your question and your comments again, here is a simple way to do dynamic derivation. It is what Java programmers call proxying: a proxy (in the Java sense) is a thin object that implements an interface (a class containing only pure virtual methods) over a real object. OK, Java provides all the machinery to create proxies and C++ is not so kind, but it is extensible enough to provide ways to build something like it.

You should simply forget about vtables and see them as interfaces - BTW you gain portability.

Here is an example showing how to implement two interfaces on a class:

#include <iostream>
#include <string>
#include <sstream>

// definition of template class proxy
template<class C, class ... I>
class Proxy: public C, public I... {
private:
    Proxy() {};
public:
    Proxy(const C& c): C(c) {};   // uses copy constructor on original object
    virtual ~Proxy() {};
};

/* if the proxy is not required to inherit from C, you can take a ref
template<class C, class ... I>
class Proxy: public I... {
protected:
    const C& obj;                   // protected so the generated methods can reach it
public:
    Proxy(const C& c): obj(c) {}    // just keep a reference to the original object
    virtual ~Proxy() {}
};
*/
// macro definition to help in proxy declarations
#define BEGIN_DECLARE_PROXY(proxy, cls, ...) \
class proxy: public Proxy<cls, __VA_ARGS__> { \
public: \
    proxy(const cls& c): Proxy(c) {}; \

#define IMPLEMENT0(type, method, function) \
    type method() { \
        return function(*this ); \
    }
#define IMPLEMENT1(type, method, function, typ1, arg1) \
    type method(typ1 arg1) { \
        return function(*this, arg1 ); \
    }
#define IMPLEMENT2(type, method, function, typ1, arg1, typ2, arg2) \
    type method(typ1 arg1, typ2 arg2) { \
        return function(*this, arg1, arg2); \
    }
#define IMPLEMENT3(type, method, function, typ1, arg1, typ2, arg2, typ3, arg3) \
    type method(typ1 arg1, typ2 arg2, typ3 arg3) { \
        return function(*this, arg1, arg2, arg3); \
    }
// could add many others - could not use VA_ARGS here ...
/* if we used the ref. version, implementation should be return function(obj, ...); */

#define END_DECLARE_PROXY };

// example actual class
struct C {
    int val;
    std::string name;
};

// example interfaces
class I {
public:
    virtual std::string display() = 0;
};

class I2 {
public:
    virtual void show(std::ostream& out) = 0;
};

// function implementing the interfaces
std::string C_display(const C& c) {
    std::stringstream ss;
    ss << c.name << " (" << c.val << ")";
    return ss.str();
}

void C_show(const C& c, std::ostream& out) {
    out << C_display(c) << std::endl;
}

// actual proxy definition
BEGIN_DECLARE_PROXY(CI, C, I, I2)
IMPLEMENT0(std::string, display, C_display)
IMPLEMENT1(void, show, C_show, std::ostream&, out)
END_DECLARE_PROXY

int main() {
    C c = {12, "Foo"}; // create an object
    CI ci(c);   // build a proxy around the object
    I& i = ci;  // an interface on the proxy

    // example calls
    std::cout << i.display() << std::endl;
    ci.show(std::cout);

    return 0;
}
Serge Ballesta
  • Indeed, you've just described C++ as implemented by all major compilers. While I've considered that approach, it presents a heavy user-facing toll. By approaching it from the other direction (working within C++'s definition), I may need to maintain a couple of compiler-specific back-end black-boxes, but they are more-or-less black-boxes which don't present excessive burden on front-end users. For this application, I'm happier with that tradeoff... if I can just capture the data. – Manu Evans Mar 20 '16 at 15:33
  • @TurkeyMan: Unsure whether it is relevant for your real use case, but here is an example of *dynamic derivation* – Serge Ballesta Mar 21 '16 at 10:54
0

I've been working on this, and I've worked it into a lib: https://github.com/TurkeyMan/virtualwrangler

It's 99% done, I just need a GCC implementation for:

template <typename C>
inline VTable GetVTable();

Can anyone think of any creative ways to coerce GCC into producing the vtable ptr for some class? The best I have is:

extern "C" void *_ZTV7MyClass;

Which would be fine, except the symbol name has the number '7' in there, which is the string length of the name of the class!! Which means I can't work that declaration into a macro... unless there's some clever way to perform a preprocessor stringlen...?
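
One possible direction, sketched here under several assumptions: GCC and Clang accept an asm label built from adjacent string literals, so a macro can at least assemble the symbol name if the length is passed in by hand. The macro names (`VW_STR`, `DECLARE_VTABLE`), the variable `vtable_of_MyClass` and the helper functions are made up for this sketch; it assumes an Itanium-ABI target, a class in the global namespace, and a simple class whose address point sits two slots into the vtable symbol. It does not solve the string-length problem.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class; f() is defined out of line so that this translation
// unit emits the vtable symbol.
struct MyClass {
    virtual int f();
    virtual ~MyClass() {}
};
int MyClass::f() { return 7; }

// Build the assembler name from string pieces. The length still has to be
// supplied by hand. __USER_LABEL_PREFIX__ is predefined by GCC and Clang
// ("_" on macOS, empty on Linux).
#define VW_STR2(x) #x
#define VW_STR(x) VW_STR2(x)
#define DECLARE_VTABLE(cls, len) \
    extern void* vtable_of_##cls[] __asm__(VW_STR(__USER_LABEL_PREFIX__) "_ZTV" #len #cls)

DECLARE_VTABLE(MyClass, 7);

// Address point for a simple class: skip offset-to-top and the RTTI pointer.
// The volatile local keeps the optimizer from reasoning about symbol identity,
// since it does not know this declaration names the real vtable.
inline void* const* get_vtable_MyClass() {
    void* const* volatile p = vtable_of_MyClass + 2;
    return p;
}

// Read the vptr out of a live object, only to check the result above.
inline void* const* vptr_of(const MyClass& obj) {
    void* const* vptr;
    std::memcpy(&vptr, &obj, sizeof vptr);
    return vptr;
}

// Sanity check: the symbol-derived pointer should match a real object's vptr.
inline void check_vtable(const MyClass& obj) {
    assert(vptr_of(obj) == get_vtable_MyClass());
}
```

The instance in `check_vtable` is there only for verification; `get_vtable_MyClass` itself never constructs one. Classes in namespaces, templates, or with virtual bases would need a different mangled name or a different address-point offset.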

Ideas?

Manu Evans