26

Look at the following C++ code:

    class Base1 {
    public:
        Base1();
        virtual ~Base1();
        virtual void speakClearly();
        virtual Base1 *clone() const;
    protected:
        float data_Base1;
    };

    class Base2 {
    public:
        Base2();
        virtual ~Base2();
        virtual void mumble();
        virtual Base2 *clone() const;
    protected:
        float data_Base2;
    };

    class Derived : public Base1, public Base2 {
    public:
        Derived();
        virtual ~Derived();
        virtual Derived *clone() const;
    protected:
        float data_Derived;
    };

Section 4.2 of *Inside the C++ Object Model* says that the virtual table layout of classes Base1, Base2 and Derived looks like this:

[image: virtual table layout of Base1, Base2 and Derived]

My question is :

The virtual table of the Base1 subobject of class Derived contains Base2::mumble. Why? I know the Derived class shares this virtual table with Base1, so I think a function of Base2 should not appear here. Could someone tell me why? Thanks.

Matthieu M.
XiaJun
  • It doesn't hurt to add additional entries to the `Derived` vtable after those of `Base1`. It could be done for efficiency. Given a `Derived*` pointer, it is cheaper to call virtual functions via the `Base1`/`Derived` vtable than via the `Base2` vtable. – n. m. could be an AI Apr 10 '13 at 09:12
  • 1
    Note: the way things are presented seems screwed up, in the Itanium ABI the `_vptr` member is actually **first**; and likewise `Base1` is the **first** member of `Derived`. – Matthieu M. Apr 10 '13 at 09:12
  • 2
    @MatthieuM. In all of the compilers I've seen, the `_vptr` (pseudo-)member has been the first thing in the class, but the standard obviously allows it anywhere. – James Kanze Apr 10 '13 at 09:19
  • 1
    @JamesKanze: Yes, which is why I specified the ABI I was talking about (which is also the only ABI I am slightly acquainted with); but it seems to me that for the optimization to be relevant, you want to make finding the address of `Base1` (and thus the `_vptr`) as easy as possible; ideally with no arithmetic involved. – Matthieu M. Apr 10 '13 at 09:28
  • @MatthieuM. You'd like to make finding the address of all of the `_vptr` as easy as possible:-). Seriously, for Intel, there is an argument for putting the `_vptr` at the end of the first base class, and maintaining a pointer to _it_ as the pointer to object. (There's no rule, or at least there didn't used to be, that a `Derived*` must point to the first byte of the object.) Intel has an addressing mode where the offset to the base pointer may be a single byte, in the range of `-128...127`. Putting the pointer to the object into the middle of the object means that you can use this more. – James Kanze Apr 10 '13 at 09:37
  • @JamesKanze There was this old Metrowerks CodeWarrior C/C++ for MacOS 7 with the vptr at the end. (Around 1996, I think.) – curiousguy Dec 25 '13 at 13:36

2 Answers

6

Well, first of all, I'll remind everyone that the design of the solution to implement polymorphism is an ABI decision outside of the Standard. For example, MSVC and the Itanium ABI (followed by gcc, clang, icc, ...) have different ways to implement this.

With that out of the way, I think that this is an optimization for lookup.

Whenever you have a Derived object (or one of its descendants) and look up the mumble member, you do not need to find the Base2 subobject first; you can work directly from the Base1 subobject (whose address coincides with that of the Derived object, so no arithmetic is involved).

Matthieu M.
  • 4
    More fundamentally, the compiler has added the functions of `Derived` behind the functions of `Base1`, so that the same `vtable` can be used for both. And `mumble` is also a function of `Derived` (by inheritance), not just of `Base2`. It's an "optimization", but it's an optimization based on the fundamentals of the language. I can't imagine any compiler not doing it. – James Kanze Apr 10 '13 at 09:16
  • Is it really a useful optimization? To _call_ `mumble`, you still need to find the `Base2` subobject for its `this` pointer, so you didn't save any work. The only possible improvement is if you frequently check the address of `mumble` without calling it, or if it improves the cache hit rate on the first subobject vtable. – Useless Apr 10 '13 at 09:18
  • 1
    @Useless: I must admit I am not sure it's that efficient; however it does remove a data-dependency. You can now compute the address of `mumble` and the address of the `Base2` subobject in parallel. Therefore, the CPU can compute `mumble`, start loading code from memory, and *in parallel* compute `Base2` subobject. – Matthieu M. Apr 10 '13 at 09:26
  • @Matthieu: Interesting point! Does the C++ standard say anything about this? And what happens if the order is changed, like `class Derived : public Base2, public Base1`? – Nayana Adassuriya Apr 10 '13 at 09:28
  • @NayanaAdassuriya: No, as I said, the Standard is only concerned with *effect* and not *means*. If you swap `Base2` and `Base1`, then the order in which they appear in the memory layout will change and `Base2`'s virtual table will be extended with `Base1` methods (most likely). – Matthieu M. Apr 10 '13 at 09:34
  • 1
    I suppose it is simply a typo. I guess that the vtable for Derived is actually contiguous, with 2 segments representing the Base1 and Base2 vtables, so to call any virtual function from the Derived viewpoint (if we know that the object is indeed a Derived) one needs the same pointer arithmetic (an offset from the beginning of the whole Derived vtable). I also found another typo: Derived::~clone() instead of Derived::clone() (just before the line in question). – user396672 Apr 10 '13 at 09:34
  • @user396672: I am not sure you can have a single v-table, because the v-table begins with a typeinfo entry (always). – Matthieu M. Apr 10 '13 at 10:16
  • @Matthieu: it is of course a compiler/ABI decision, but technically it seems that nothing prevents putting any number of typeinfo entries among the functions anywhere inside the whole Derived vtable (although I'm not sure either). – user396672 Apr 10 '13 at 11:51
  • @Matthieu: ...at least nothing prevents the compiler from putting both the Derived/Base1 and Derived/Base2 tables together in a contiguous space, as the compiler is free to put them anywhere. Btw, the picture contains another mistake: obviously Derived can't "share" a vtable with Base1, since the tables have different content (probably "shared with Derived/Base1" was the intent). Honestly, I can't fully trust this kind of source... – user396672 Apr 10 '13 at 12:27
  • 3
    @user396672: that is not a mistake. `Derived` and `Base1` have different v-tables, however the v-ptr in `Base1` can point to the table of `Derived` because the table `Base1` is a prefix of that of `Derived` (and `Base1` will only ever use the prefix it knows about). – Matthieu M. Apr 10 '13 at 14:51
  • @Matthieu: I agree it may be treated as a terminology inconsistency rather than a mistake: the second vtable is referred to as Derived/Base2, but the first appears simply as Base1, which is ambiguous (Base1's original vtable or the Derived/Base1 subobject's vtable). – user396672 Apr 10 '13 at 15:51
  • @Useless I think the vtable would contain `D::mumble`, not `Base2::mumble`. No pointer adjustment needed. – curiousguy Dec 24 '13 at 19:33
-2

At runtime when you get:

    Base2 b2;
    Base1* b1_ptr = (Base1*)&b2;
    b1_ptr->mumble();    // will call Base2::mumble(), this is the reason.

Then Base2::mumble() needs to be invoked! Note that mumble() is the ONLY virtual method that was overridden in the hierarchy. (You may think that clone() is overridden too, but it returns a different type in each class, so it has a different signature.)