(This is quite a large question about software design. In case it's not suited for StackOverflow I'm willing to copy it to the Software-Engineering community)
I'm working with heap_stat, a script that investigates dumps. The script is based on the idea that, for any object which has a virtual function, the vftable field is always the first one (which makes it possible to find the memory address of the object's class).
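To illustrate the idea the script relies on, here is a minimal sketch that reads the first pointer-sized word of an object. The classes `Shape` and `Square` and the helper `first_word` are hypothetical names made up for this example. The sketch assumes a compiler that stores the vtable pointer at offset 0 for a class whose primary base is polymorphic (MSVC, GCC and Clang do this in practice, but it is ABI behaviour, not something the C++ standard guarantees).

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <cstdint>
#include <cstring>

// Hypothetical classes, used only for this illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
    int id = 0;
};

struct Square : Shape {
    int sides() const override { return 4; }
};

// Copies the first pointer-sized word out of an object's storage.
// Assumption: for a polymorphic class this word is the vtable pointer.
template <typename T>
std::uintptr_t first_word(const T& object) {
    static_assert(sizeof(T) >= sizeof(std::uintptr_t), "object too small");
    std::uintptr_t word = 0;
    std::memcpy(&word, &object, sizeof(word));
    return word;
}
```

Under that assumption, two `Square` objects yield the same word (`assert(first_word(a) == first_word(b))`), while a `Shape` and a `Square` yield different ones, which is what lets a dump tool map an object back to its class.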
In my applications some objects have vftable entries (typically every STL object has one), but quite a few objects don't.
In order to force the presence of a vftable field, I've done the following test:
Create a nonsense class, having a virtual function, and let my class inherit from this nonsense class:
class NONSENSE {
    virtual int nonsense() { return 0; }
};
class Own_Class : public NONSENSE, ...
This, as expected, created a vftable entry in the symbols, which I could find (using Windbg's x /2 *!Own_Class*vftable* command):
00000000`012da1e0 Own_Application!Own_Class::`vftable'
I also saw a difference in memory usage:
sizeof(a normal Own_Class object) = 2928
sizeof(inherited Own_Class object) = 2936
=> 8 bytes have been added for this object.
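The size difference can be reproduced with a small sketch. `PlainClass` and `InheritedClass` are hypothetical stand-ins for the two variants of `Own_Class`; the 2928-byte payload mirrors the size measured above. The sketch assumes the only layout change is the added vtable pointer, so the difference should equal `sizeof(void*)`, which is 8 on a 64-bit build.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the original class: 2928 bytes of payload,
// no virtual functions.
struct PlainClass {
    unsigned char payload[2928];
};

// Same helper base as in the test above.
struct Nonsense {
    virtual int nonsense() { return 0; }
};

// Same payload, but inheriting from the polymorphic helper base.
struct InheritedClass : Nonsense {
    unsigned char payload[2928];
};

// Number of bytes added by making the class polymorphic.
inline std::size_t vptr_cost() {
    assert(sizeof(InheritedClass) > sizeof(PlainClass));
    return sizeof(InheritedClass) - sizeof(PlainClass);
}
```

Because 2928 is a multiple of the pointer size, no extra padding is expected here. A class whose size is not a multiple of the pointer size could grow by more than one pointer because of alignment.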
There's a catch: apparently quite a few objects are defined as:
class ATL_NO_VTABLE Own_Class
This ATL_NO_VTABLE blocks the creation of the vftable entry, which means the following (ATL_NO_VTABLE equals __declspec(novtable)):
// __declspec(novtable) is used on a class declaration to prevent the vtable
// pointer from being initialized in the constructor and destructor for the
// class. This has many benefits because the linker can now eliminate the
// vtable and all the functions pointed to by the vtable. Also, the actual
// constructor and destructor code are now smaller.
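One way to check what the attribute does to object layout is to compare the size of a class derived from a marked base with one derived from an unmarked base. This is only a sketch: the macro `NO_VTABLE` and all class names are hypothetical, and on compilers other than MSVC the macro expands to nothing. As far as I can tell from the Microsoft documentation, the attribute only stops the marked class's constructor and destructor from initializing the vfptr; the pointer slot itself should still be part of the object.

```cpp
#include <cassert>
#include <cstddef>

#if defined(_MSC_VER)
#define NO_VTABLE __declspec(novtable)  // what ATL_NO_VTABLE expands to
#else
#define NO_VTABLE                       // assumption: no equivalent elsewhere
#endif

// Hypothetical interface-style base, never instantiated on its own.
struct NO_VTABLE MarkedBase {
    virtual int value() const = 0;
protected:
    ~MarkedBase() = default;
};

// Same interface without the attribute, for comparison.
struct UnmarkedBase {
    virtual int value() const = 0;
protected:
    ~UnmarkedBase() = default;
};

struct FromMarked : MarkedBase {
    int value() const override { return 1; }
    int data = 0;
};

struct FromUnmarked : UnmarkedBase {
    int value() const override { return 1; }
    int data = 0;
};

// True when the attribute made no difference to the object size.
inline bool same_object_size() {
    return sizeof(FromMarked) == sizeof(FromUnmarked);
}

// Virtual dispatch through the marked base still has to work, because the
// most derived (unmarked) class sets the vfptr in its own constructor.
inline int call_through_base(const MarkedBase& base) {
    return base.value();
}
```

If `same_object_size()` returns true on the target compiler, the attribute does not save the 8 bytes per object; whatever it saves would be in constructor/destructor code and in vtables the linker can drop.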
In my opinion, this means that the vftable does not get created, so object methods get called more directly, which affects the speed of method execution and stack handling. Allowing the vftable to be created has the following impact:
Not to be taken into account:
- There is one more call on the stack; this only has an impact on systems which are already at the limit of their memory usage. (I have no idea how the linker points to a particular method.)
- The CPU usage increase will be too small to be seen.
- The speed decrease will be too small to be seen.
To be taken into account:
- As mentioned before, the memory usage of the application increases by 8 bytes per object. When a regular object has a size of some 1000 bytes, this means a memory usage increase of about 1%, but for objects with a memory size of less than 80 bytes, this might cause a memory usage increase of 10% or more.
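The percentages above follow from a simple ratio, sketched here. The function name `overhead_percent` is made up for this example, and the calculation only looks at `sizeof`; it ignores any rounding the heap allocator may apply to the allocation size.

```cpp
#include <cassert>
#include <cstddef>

// Relative growth, in percent, when `extra_bytes` are added to an object
// of `object_size` bytes (8 extra bytes = one 64-bit vtable pointer).
inline double overhead_percent(std::size_t object_size,
                               std::size_t extra_bytes = 8) {
    assert(object_size > 0);
    return 100.0 * static_cast<double>(extra_bytes) /
           static_cast<double>(object_size);
}
```

With these inputs, `overhead_percent(1000)` gives 0.8 (roughly 1%) and `overhead_percent(80)` gives exactly 10, so every object smaller than 80 bytes grows by more than 10%.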
Now I have the following questions:
- Is my analysis on the impact correct?
- Is there a better way to force the creation of the vftable field, having less impact?
- Did I miss anything?
Thanks in advance