14

Suppose I have

#include <iostream>

class A           { public: void print() { std::cout << "A"; } };
class B: public A { public: void print() { std::cout << "B"; } };
class C: public A {                                             };

How is inheritance implemented at the memory level?

Does C copy the print() code into itself, or does it hold a pointer to it that points somewhere in A's part of the code?

How does the same thing happen when we override the previous definition, for example in B (at the memory level)?

Moeb

5 Answers

10

Compilers are allowed to implement this however they choose. But they generally follow CFront's old implementation.

For classes/objects without inheritance

Consider:

#include <iostream>

class A {
public:
    void foo()
    {
        std::cout << "foo\n";
    }

    static int bar()
    {
        return 42;
    }
};

A a;
a.foo();
A::bar();

The compiler changes those last three lines into something similar to:

struct A a = <compiler-generated constructor>;
A_foo(&a); // "&a" becomes the "this" pointer; there are no objects as far as
           // assembly code is concerned. Instead, member functions (i.e., methods)
           // are simply functions that take a hidden this pointer

A_bar();   // since bar() is static, there is no need to pass the this pointer

Once upon a time I would have guessed that this was handled with pointers-to-functions in each A object created. However, that approach would mean that every A object would contain identical information (pointer to the same function) which would waste a lot of space. It's easy enough for the compiler to take care of these details.
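To make the "hidden this pointer" idea concrete, here is a minimal sketch that writes the lowered function out by hand. The `Counter` class and the `Counter_add` name are hypothetical, made up for this illustration; real compilers use mangled names and their own calling conventions.

```cpp
#include <cassert>

// Hypothetical class, used only for illustration.
class Counter {
public:
    int value = 0;
    void add(int n) { value += n; }      // ordinary member function
    static int limit() { return 42; }    // static: receives no `this` at all
};

// Hand-written equivalent of what the compiler conceptually generates:
// a plain function whose first parameter plays the role of `this`.
void Counter_add(Counter* self, int n) { self->value += n; }

// Member functions take no space in the object. On mainstream
// implementations the object holds only its data member.
static_assert(sizeof(Counter) == sizeof(int),
              "no per-object function pointers expected");

// Calls the real member function once and the hand-lowered one once.
int lowered_call_demo() {
    Counter c;
    c.add(2);            // compiler passes &c as the hidden `this`
    Counter_add(&c, 3);  // same effect with the pointer written out
    assert(c.value == 5);
    return c.value + Counter::limit();
}
```

Calling `lowered_call_demo()` from `main` exercises both forms on the same object; the `static_assert` is the part that answers the "is a pointer stored in every object?" guess.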

For classes/objects with non-virtual inheritance

Of course, that wasn't really what you asked. But we can extend this to inheritance, and it's what you'd expect:

class B : public A {
public:
    void blarg()
    {
        // who knows, something goes here
    }

    int bar()
    {
        return 5;
    }
};

B b;
b.blarg();
b.foo();
b.bar();

The compiler turns the last four lines into something like:

struct B b = <compiler-generated constructor>;
B_blarg(&b);
A_foo(&b.A_portion_of_object);
B_bar(&b);
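The same rules can be checked from inside the language. This is only a sketch with hypothetical `Base` and `Derived` types: it shows that, without `virtual`, the function is picked from the static type of the expression, and that inherited code simply runs on the base part of the derived object.

```cpp
#include <cassert>

// Hypothetical types for illustration.
struct Base {
    int a = 1;
    int id() const { return 1; }
    int twice_a() const { return 2 * a; }
};

struct Derived : Base {
    int b = 2;
    int id() const { return 2; }  // hides Base::id; nothing virtual here
};

// Returns true if the static type alone decides which id() runs.
bool static_dispatch_demo() {
    Derived d;
    const Base* as_base = &d;     // implicit upcast to the Base subobject
    assert(d.twice_a() == 2);     // inherited code, run on the Base part of d
    return d.id() == 2            // chosen at compile time from type Derived
        && as_base->id() == 1     // chosen at compile time from type Base
        && d.Base::id() == 1;     // the hidden function is still reachable
}
```

Note that `as_base->id()` returns 1 even though the object is really a `Derived`; that is the behaviour virtual functions exist to change.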

Notes on virtual methods

Things get a little trickier when you talk about virtual methods. In that case, each class gets a class-specific array of pointers-to-functions, one such pointer for each virtual function. This array is called the vtable ("virtual table"), and each object created has a pointer to the relevant vtable. Calls to virtual functions are resolved by looking up the correct function to call in the vtable.
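A vtable can be emulated by hand, which may make the description above easier to picture. This is a simplified sketch, not what any particular compiler emits: `Shape`, `ShapeVTable` and the function names are hypothetical, and real vtables usually carry extra data such as type information and offsets.

```cpp
#include <cassert>

struct Shape;  // hypothetical type for illustration

// One table per "class": a record of function pointers.
struct ShapeVTable {
    int (*sides)(const Shape*);
};

// Every object carries one hidden pointer to its class's table.
struct Shape {
    const ShapeVTable* vptr;
};

int triangle_sides(const Shape*) { return 3; }
int square_sides(const Shape*)   { return 4; }

const ShapeVTable triangle_vtable = { triangle_sides };
const ShapeVTable square_vtable   = { square_sides };

// A "virtual call": load the table from the object, then call through it.
int call_sides(const Shape* s) { return s->vptr->sides(s); }

int vtable_demo() {
    Shape t{ &triangle_vtable };
    Shape q{ &square_vtable };
    assert(t.vptr != q.vptr);  // different "classes", different tables
    return call_sides(&t) * 10 + call_sides(&q);
}
```

The cost per object is one pointer regardless of how many virtual functions the class has; the tables themselves exist once per class.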

Max Lybbert
  • +1 for bringing name mangling and the compiling process into discussion. Besides getting more complex (not necessarily through a vtable) with virtual methods, it gets even messier with multiple and virtual inheritance... probably out of the scope of the question. – David Rodríguez - dribeas Apr 21 '10 at 08:03
  • Yeah, I didn't want to get into the details of multiple inheritance or virtual inheritance. – Max Lybbert Apr 21 '10 at 08:40
  • 1
    "_each object created gets an extra hidden table_" no, each **object** created gets **a pointer to the vtable** (virtual table), called the vptr. Each dynamic class gets one or more vtables. "_the vtable, which is an array of pointers-to-functions_" no, it is more like a struct with different informations (many pointers to function (sometimes with an offset), a pointer to a char array, sometimes many offsets, pointer to other data structures...) – curiousguy Aug 05 '12 at 06:32
3

Check out the C++ ABI for any questions regarding the in-memory layout of things. It's labelled "Itanium C++ ABI", but it has become the de facto standard C++ ABI implemented by most compilers (MSVC, which uses its own ABI, is the notable exception).

DevSolar
3

I don't think the standard makes any guarantees. Compilers can choose to make multiple copies of functions, combine copies that happen to access the same memory offsets on totally different types, etc. Inlining is just one of the more obvious cases of this.

But most compilers will not generate a copy of the code for A::print to use when called through a C instance. There may be a pointer to A in the compiler's internal symbol table for C, but at runtime you're most likely going to see that:

A a; C c; a.print(); c.print();

has turned into something much along the lines of:

A a;
C c;
ECX = &a; /* set up 'this' pointer */
call A::print; 
ECX = up_cast<A*>(&c); /* set up 'this' pointer */
call A::print;

with both call instructions jumping to the exact same address in code memory.

Of course, since A::print is defined inside the class body, it is implicitly inline, so the code will most likely be copied to every call site (but since the copy replaces the call A::print, it's not actually adding much to the program size).
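The "same address" claim can be checked without looking at assembly by comparing pointers to members. A small sketch, assuming a variant of the question's classes where `print` returns an `int` instead of writing to `cout`, so the result is easy to inspect:

```cpp
#include <cassert>
#include <type_traits>

// Variant of the question's classes; print() returns a value here
// (an assumption made for this sketch) instead of printing.
struct A { int print() { return 1; } };
struct C : A {};

// Naming the inherited member through C still yields a pointer to A's member.
static_assert(std::is_same<decltype(&C::print), int (A::*)()>::value,
              "C has no print of its own");

// Returns true if both spellings designate the same member function.
bool same_function() {
    int (A::*via_a)() = &A::print;
    int (A::*via_c)() = &C::print;
    C c;
    assert((c.*via_c)() == 1);  // runs A's code on the A part of c
    return via_a == via_c;
}
```

The `static_assert` shows that even the type system treats `C::print` as a member of `A`; there is no second copy to point at.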

Ben Voigt
1

There will not be any information stored in an object to describe a member function.

aobject.print();
bobject.print();
cobject.print();

The compiler will just convert the above statements into direct calls to the appropriate print function; essentially nothing is stored in an object.

A pseudo-assembly instruction would look like this:

00B5A2C3   call        print(006de180)

Since print is a member function, it takes an additional parameter: the this pointer. It is passed just like every other argument to the function.
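The standard library exposes this "object as an extra argument" view directly through `std::mem_fn`. A short sketch with a hypothetical `Printer` type:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical type for illustration.
struct Printer {
    std::string tag;
    std::string print() const { return "print:" + tag; }
};

// Calls print() through a wrapper that takes the object as an explicit argument.
std::string explicit_this_demo() {
    Printer p{"x"};
    auto fn = std::mem_fn(&Printer::print);  // callable: object first, then arguments
    assert(fn(p) == p.print());              // same function, object passed explicitly
    return fn(p) + "," + fn(&p);             // accepts a reference or a pointer
}
```

Nothing in `p` refers to `print`; the wrapper only needs to know which object to hand over as `this`.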

yesraaj
  • @yesraaj: I was referring to the class code. Does `C class` copy `print()` definition to itself or just uses a pointer to `print()` in `A class`? – Moeb Apr 21 '10 at 05:01
  • @cambr I don't think class will be stored as it is any where. Any way lets wait for few more answers – yesraaj Apr 21 '10 at 05:08
1

In your example here, there's no copying of anything. Generally an object doesn't know what class it's in at runtime -- what happens is, when the program is compiled, the compiler says "hey, this variable is of type C, let's see if there's a C::print(). No, ok, how about A::print()? Yes? Ok, call that!"

Virtual methods work differently, in that pointers to the right functions are stored in a "vtable"* referenced in the object. That still doesn't matter if you're working directly with a C, cause it still follows the steps above. But for pointers, it might say like "Oh, C::print()? The address is the first entry in the vtable." and the compiler inserts instructions to grab that address at runtime and call to it.

* Technically, this is not required to be true. I'm pretty sure you won't find any mention in the standard of "vtables"; it's by definition implementation-specific. It just happens to be the method the first C++ compilers used, and happens to work better all-around than other methods, so it's the one nearly every C++ compiler in existence uses.

cHao
  • "_Generally an object doesn't know what class it's in at runtime_" an instance of a dynamic class certainly knows its type at runtime! – curiousguy Aug 05 '12 at 06:41
  • @curiousguy: In C++, classes *don't even exist* at runtime. It's all just bytes, pointers and smoke. And what are these "dynamic classes" you speak of, anyway? – cHao Aug 05 '12 at 10:16
  • "_classes don't even exist at runtime_" No true, `typeid(*this).name()` certainly does exist at runtime "_And what are these "dynamic classes" you speak of, anyway?_" classes with a vptr (either for virtual functions or for virtual base classes) – curiousguy Aug 05 '12 at 13:30
  • @curiousguy: Yeah, typeid's can exist at runtime. Classes, however, don't. The first C++ compiler started as a translator that turned C++ into C -- which, of course, doesn't contain any built-in notion of "classes". Objects became structs, member functions became global functions with slightly mangled names, and virtual functions became vtable entries. To a huge degree, though they now generate machine code directly, they *still* works this way -- CPUs rarely know or care about "classes" either. Like i said, it's all bytes and pointers...plus a little sleight of hand and misdirection. – cHao Aug 05 '12 at 16:31
  • As for how RTTI works when the classes no longer actually exist, i presume there's a pointer to a preexisting (compiled-in) type_info in the vtable (probably the first entry). Seems the simplest thing that'd work. – cHao Aug 05 '12 at 17:08
  • "_The first C++ compiler started as a translator that turned C++ into C_" yes, the cfront compiler - I know this stuff. "_which, of course, doesn't contain any built-in notion of "classes"._" of course. Most compilers compile to assembly, which don't have a notion of class either. That is the **whole point of the compiler, or translator: to translate a programming language into another, simpler, lower level, with less concepts**. C and assembly do not have any concept of vptr, of vtable either. And assembly does not know about C or C++ objects... obviously. – curiousguy Aug 05 '12 at 18:43
  • When you are saying that classes do not exist at runtime, are you also saying that objects do not exist at runtime? That anything that doesn't exist in the C language doesn't exist at runtime? It now seems to me that your previous statement "Classes don't exist at runtime" is empty (it says nothing). It is inherently tautological/trivial: you could replace "class" with anything, and even replace "C++" by any other language. If you can make the statement not inherently tautological, i.e. if you can name some **C++ concept that still exists at runtime**, please enlighten me! – curiousguy Aug 05 '12 at 18:48
  • @curiousguy: ***THAT'S THE FREAKING POINT!*** Classes, objects, etc (anything other than bytes, certain-sized integers, pointers, floating-point numbers, and functions) are at a higher level of abstraction, and needs to be deconstructed into native things in order to translate it. And my point is that in the case of C++, any part of that abstraction that's not absolutely needed (like, say, anything saying `A::function` only works with `A`s) is simply **tossed out and not translated at all**. RTTI is basically a pointer to a name, nothing more. Its only use is comparison with other names. – cHao Aug 05 '12 at 19:45
  • "Classes don't exist at runtime" is not tautological; in languages like C# and Java, a "class" is an actual thing -- it's there, it can be gotten, looked at, and even used to create new objects at will. You're basically *running* at a higher level of abstraction in those languages; the runtime's job there is to translate the higher-level stuff to lower-level stuff on the fly. You don't have that with C++ -- that higher-level stuff is tossed aside *at compile time*, and the runtime doesn't really translate anything -- it just calls a few functions and provides you some other functions. – cHao Aug 05 '12 at 20:16
  • You know, the JVM is not everything; Java can be compiled to assembly too. The dynamic type of an object can be "gotten, looked at" in C++ too with `typeid`. "_that higher-level stuff is tossed aside at compile time_" again, that is an empty statement "_the runtime doesn't really translate anything_" what about dynamic link? lazy dynamic link? explicit module loading (`dlopen`, `dlclose`, `dlsym`)? – curiousguy Aug 05 '12 at 23:51
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/14928/discussion-between-curiousguy-and-chao) – curiousguy Aug 05 '12 at 23:56