Why are member function pointers different from normal function pointers in C++?

Question

In the beginning, there was C.
And C had structure, and expressions, and functions to package them. And it was good.
But C also had goto and switch case falling and syntax that followed use, so maybe not that good.

It also had pointers, causing much gnashing of teeth from aliasing and pointer arithmetic!
But it also had function pointers, allowing run time dispatch, and much rejoicing followed.
Now, data could dictate code, as well as code dictating data, and both were First Class (or close).
For anything that could be pointed to, could be pointed to, with the same pointer: the holy void*.

And all were equal in its glory.

Then C++ came, and bound the data and the code into the Object.
And lo, this was just syntactical sugar, for a function and a method are not so different,
(no matter what the Sun or the Oracle may tell you).

Obj->Foo(int val) being (approximately) the same as Foo(Obj* this, int val), they were still equal,
under the holy void*.

And then, with inheritance, came strife, as the outer Derived class may add to the inner Base.
But lo, in these simple times, a solution was found: put every Base before Derived.
Then, with the same pointer, we may point to both child and father.

And still, anything that could be pointed to, could be pointed to with the holy void*.

With the Virtual, we lost our simplicity, and wandered for ages.
Wondering, how to deal with diamonds, or circles which are not quite ellipses.
But even as we cast off our old tenets of C, with each instruction reducing to simple asm,
we embraced, that some things should look simple (even if, under, they are complex).

And so, we looked at the secret VTables that arose and thought "good enough".
For while we had introduced hidden data, we had reduced complexity.
And now, any call made through a Virtual was redirected through a VTable.
And as all subobjects of a class could be pointed to through a single pointer, this was enough.

But even though the dispatch method had changed, we could still point to all things, with holy void*.

But then was made, what many in the present time, consider a grave mistake: Multiple Inheritance.
And no longer would one pointer suffice! For how can both Base fathers be at the start?
And no longer, then, would a VTable suffice, for how would we know which subobject to point to!
And now, the problem of diamonds arises even worse than before, with no obvious solution, and our prior ones requiring current code to deal with future prospects!

And so, pointer adjustments must be made, with each Virtual call.
Because a Base class may really be a MI Derived class in disguise, and needs to be adjusted.
So for the support of the Choice Few who used MI, we all paid a price.

And suddenly, the holy void* could no longer store what had hitherto been simple sugar.
And what arose to deal with this complexity, was the dreaded member function pointer.

The beast required its own syntax, for none other would suffice.
And this syntax, as rarely used, would be so low in precedence as to require parens with every use.
Though in the holy Standard, what wicked council decided to allow such wretched things to be Cast,
but when Cast from type to type, not Called without invoking behavior most undefined!

And nay, decadent with syntax and greedy, they were Fat and could not fit into void*.
As the only way to know what object to point to, was with an adjustment,
nestled deep in the pointer, and checked with each VTable lookup.

But this, brothers, is not how it must be.
This complexity of implementation comes from a most peculiar of decisions.

class Base1
{
public:
    virtual void foo();
};

class Base2
{
public:
    virtual void bar();
};

class Derived: public Base1, public Base2
{
public:
    void unrelated();
};

As is seen here, a Derived* must be adjusted when calling foo() or bar(); it cannot point to both the Base1 and the Base2 subobject at the same time, as it could under simple single inheritance. It is, indeed, impossible to predict how much of an offset is needed when the call is made through a base class pointer, which is why most compilers have some sort of mechanism for adding it into the vtable.
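A minimal sketch of the adjustment described above, reusing the class names from the question. The data members, the function bodies, and the helper `subobject_offset` are assumptions added so that the example compiles on its own; the exact offsets depend on the compiler's object layout.

```cpp
#include <cstddef>

// Same shape as the classes in the question. The data members and the
// function bodies are assumptions added for this sketch.
class Base1 {
public:
    virtual ~Base1() = default;
    virtual void foo() {}
    int base1_data = 1;
};

class Base2 {
public:
    virtual ~Base2() = default;
    virtual void bar() {}
    int base2_data = 2;
};

class Derived : public Base1, public Base2 {
public:
    void unrelated() {}
};

// Hypothetical helper: byte distance from the start of the Derived object
// to the start of its Base subobject.
template <class Base>
std::ptrdiff_t subobject_offset(Derived& obj) {
    Base* as_base = &obj;  // implicit upcast; the compiler may change the address here
    return reinterpret_cast<char*>(as_base) - reinterpret_cast<char*>(&obj);
}
```

Calling `subobject_offset<Base1>(d)` and `subobject_offset<Base2>(d)` on the same object gives two different values, because two non-empty base subobjects cannot share an address. Whichever base does not sit at offset zero is the one whose member functions need an adjusted `this`.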

However:

class Derived: public Base1, public Base2
{
public:
    void unrelated();

    virtual void foo() { Base1::foo(); }
    virtual void bar() { Base2::bar(); }
};

Solves the problem, with nary a change to the original object model!
As each method now exists, it can be added properly to the vtable, and when called, knows exactly how much to adjust the pointer, allowing the call to proceed without any mucking about!
And now, both casting a member function pointer as well as calling it when cast, are well defined!
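The forwarding overrides can be written by hand today. The sketch below does that, with two assumptions that are not in the question: the functions return a `std::string` trace instead of `void`, and the base functions have bodies. It only shows where dispatch lands; it makes no claim about the vtable layout a given compiler emits for it.

```cpp
#include <string>

class Base1 {
public:
    virtual ~Base1() = default;
    virtual std::string foo() { return "Base1::foo"; }
};

class Base2 {
public:
    virtual ~Base2() = default;
    virtual std::string bar() { return "Base2::bar"; }
};

class Derived : public Base1, public Base2 {
public:
    void unrelated() {}

    // Explicit forwarders: Derived is now the final overrider of both
    // functions, and each forwarder names the base it delegates to.
    std::string foo() override { return "Derived::foo -> " + Base1::foo(); }
    std::string bar() override { return "Derived::bar -> " + Base2::bar(); }
};

// Calls bar() through a Base2 reference, the case that needs an adjusted `this`.
std::string call_bar(Base2& obj) { return obj.bar(); }
```

Passing a `Derived` to `call_bar` returns the trace from `Derived::bar`, which shows that a call made through `Base2` reaches the forwarder first and the base implementation second.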

And most importantly of all, anything that can be pointed to, can be pointed to by a holy void*.
All a member function pointer need be is a normal function pointer that takes a special first parameter.
And lo, things would have been pretty good.

If only we lived in such a dream world.

Unfortunately for us, we live with fat member function pointers, vtables that have to adjust each and every call, and member function pointers that cannot appropriately create delegates or many other useful patterns. In MSVC, they change size when you cast them!
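The "fatness" is easy to observe. The sketch below prints the sizes of an ordinary function pointer and of two member function pointers. The class names (`Plain`, `A`, `B`, `Multi`) and the function `print_pointer_sizes` are hypothetical, and the numbers are implementation-specific. Under the Itanium C++ ABI used by GCC and Clang on most platforms, a member function pointer is a pair of a pointer and a `this` adjustment, so it is typically twice as large; under MSVC the size depends on the inheritance model of the class.

```cpp
#include <cstddef>
#include <iostream>

// Hypothetical classes, not taken from the question.
struct Plain { void method() {} };
struct A { virtual ~A() = default; virtual void f() {} };
struct B { virtual ~B() = default; virtual void g() {} };
struct Multi : A, B { void h() {} };

using FreeFn     = void (*)();         // ordinary function pointer
using PlainMemFn = void (Plain::*)();  // member of a class with no bases
using MultiMemFn = void (Multi::*)();  // member of a multiply inheriting class

// Prints the three sizes; the values vary by compiler and ABI.
void print_pointer_sizes(std::ostream& out) {
    out << "function pointer:                " << sizeof(FreeFn) << " bytes\n"
        << "member function pointer (Plain): " << sizeof(PlainMemFn) << " bytes\n"
        << "member function pointer (Multi): " << sizeof(MultiMemFn) << " bytes\n";
}
```

Calling `print_pointer_sizes(std::cout)` on a few compilers is a quick way to see the differences described above.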

The problem is so great that std::function can allocate memory dynamically in most implementations, because of the various problems outlined here. The use of thunks for unoverridden methods, as I have detailed, would solve this problem quite handily. The cost is a few inlinable hidden methods and some vtable changes, along with a possible (but tiny) decrease in the speed of a virtual dispatch in the case where the function isn't overridden and also can't be inlined perfectly.

And for this tiny, negligible gain in speed and space, we've created a monstrosity in member function pointers, influenced a half dozen other languages not to use multiple inheritance despite the easily solved problems, castrated member function pointers in use, and made our delegates slower.

In fact, as noted in The Fastest Possible Delegates, the current solution actually slows down every Virtual invocation; it forces extra checking, and extra memory usage, via fat pointers, which even for single inheritance, must store extra data (or risk losing it, as MSVC member function pointers can). This is clearly not in the "pay if you use, not if you don't" philosophy of C++!

So, to reiterate, why are member function pointers different from "loose" function pointers? Is there any logical reason why they are not simply function pointers with a special calling convention, or with an extra argument for the "this"?
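For comparison, here is a sketch of what "a function pointer with an extra argument for this" can look like in today's C++, under one restriction: the member function has to be known at compile time. `Counter`, `as_free_function`, `FreeAdd`, and `make_add_pointer` are hypothetical names introduced for this example.

```cpp
// Hypothetical class used only for this sketch.
struct Counter {
    int total = 0;
    int add(int amount) { total += amount; return total; }
};

// Wraps a member function, fixed at compile time, in an ordinary function
// that takes the object as an explicit first parameter.
template <class T, int (T::*Method)(int)>
int as_free_function(T* self, int arg) {
    // The parentheses are required: ->* binds less tightly than the call operator.
    return (self->*Method)(arg);
}

// An ordinary function pointer type, with `this` as a normal argument.
using FreeAdd = int (*)(Counter*, int);

FreeAdd make_add_pointer() {
    return &as_free_function<Counter, &Counter::add>;
}
```

The result of `make_add_pointer()` can be stored and called like any other function pointer, for example `make_add_pointer()(&counter, 2)`. The trade-off is that the member is baked into the template argument, so this does not replace a member function pointer whose value is chosen at run time.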

Because methods (functions) and data are bound (encapsulated) in the logical unit "class" by the language to support this. – Mantosh Kumar, Mar 25 '14 at 02:24
Your proposed solution fails to permit the caller to specify which of two methods is to be used in the case of diamond shaped inheritance where both intermediate superclasses override the original method. — Warren Dew, Mar 25 '14 at 02:36
@aruisdante The question is quite clear, and asked twice: Why are member function pointers different from normal function pointers, when minor adjustments to the standard could allow them to be? — Alice, Mar 25 '14 at 02:36
@WarrenDew The proposed solution has nothing to do with the diamond problem, as that is adequately solved via virtual inheritance, which does not cause member function pointers to become grotesque. Also, you could always use the scope operator, same as you can now. — Alice, Mar 25 '14 at 02:37
The first paragraph says nothing about pointers. It's a brief mention of a few features of the C language, and a value judgement on their goodness. Not relevant. — Benjamin Lindley, Mar 25 '14 at 03:14

score 6 · Answer 1 · answered Mar 25 '14 at 02:33

C++ has a rule of "don't pay for what you don't use," meaning that normal operations shouldn't be slowed down to pay for other language features the programmer isn't using. While you absolutely could make member function pointers the same as normal function pointers, the extra overhead from factoring in dynamic dispatch, different vtable offsets, different base object offsets and thunks, etc. would introduce extra overhead into normal function pointers, either due to the extra memory needed to store this information (larger size) or the extra logic required to do the dispatch (extra time and code generated). Therefore, it makes sense to split function pointers and member function pointers apart into separate types with separate implementations.

Hope this helps!

templatetypedef

This is categorically wrong; please read the last paragraph. In order to support this not very often used functionality and reduce the cost of overridden virtual functions by a negligible amount, C++ has crippled member function pointers, introduced drastically more syntactical problems through the special member function and member data syntax, and made delegates and generalized function holders/callers like std::function possibly invoke dynamic memory allocations! We all pay the cost for this. – Alice Mar 25 '14 at 02:35
Excellent answer. Alice's solution can always be implemented manually by programmers who want to pay the overhead to use it. – Warren Dew Mar 25 '14 at 02:37
@WarrenDew That's clearly not true; std::function can cause dynamic memory allocations, which is why the fastest delegate implementations almost always invoke undefined behavior in order to get speed. [See here](http://www.codeproject.com/Articles/7150/Member-Function-Pointers-and-the-Fastest-Possible). – Alice Mar 25 '14 at 02:39
@Alice But the cost is our souls; not performance. – Lilshieste Mar 25 '14 at 02:39
@Lilshieste The cost is performance; vtables in all modern C++ compilers become grotesque and quite slow. GCC does a weird optimization where they double enlarge the vtable in order to store adjustments. For this very strange "optimization", we pay a huge cost. – Alice Mar 25 '14 at 02:40
@Alice The C++ philosophy is "if you don't want to pay for `std::function`, you don't have to use it. You can always just use raw function pointers or member function pointers instead." Perhaps I'm missing the point of your question, though - am I correct that your question is "why does C++ separate function pointers from member function pointers?" Also, can you please elaborate on how I'm "categorically wrong," how C++ has "crippled" member function pointers, and how there are "drastically more syntactical problems?" – templatetypedef Mar 25 '14 at 02:46
@templatetypedef Again, read the prior comment. This isn't "if you don't want to pay for std::function"; in order to implement this behavior, each and every call to a virtual function pays a cost, in the upkeep of adjustment fields and the expansion of vtables that occurs. In MY solution, the cost is only paid by virtual functions which do not override. Which is more in the spirit of C++, paying a cost no matter what, or paying a cost if you use it? – Alice Mar 25 '14 at 02:48
@Alice If you don't use inheritance, you don't pay the cost of vtables. Member functions need to be as fast as possible (damn the torpedoes!). You can write a lot of OO code without using inheritance at all. The same can't be said about member functions - they're a core concept to OO. – Lilshieste Mar 25 '14 at 02:49
@Lilshieste I'm saying that the current way it is done is actually more costly in the general case than the proposed solution. – Alice Mar 25 '14 at 02:49
For those who don't understand why this is the case, [please see here](http://www.codeproject.com/Articles/7150/Member-Function-Pointers-and-the-Fastest-Possible) for an explanation as to the internals of how compilers make vtables. Because member function pointers and vtables don't know what pointer they might be adjusting, they pay a cost at every invocation, even for classes which do not invoke this behavior. With my solution, there would be much smaller cost only on those classes (and invocations) which invoke this behavior. This is clearly more "pay as you go" than the current solution. – Alice Mar 25 '14 at 02:51
It seems like your question is actually different from what you've asked. You're asking "why is inheritance and dynamic dispatch implemented as it is in most C++ implementations?" and not "why are function and member function pointers separate?" You might want to ask this question more explicitly, possibly in a separate question. – templatetypedef Mar 25 '14 at 02:52
@templatetypedef Those two are intimately related; the reason they are separate is because they are implemented in this fashion; I am asking for a logical reason why this is so. – Alice Mar 25 '14 at 02:53
Also, there's one last detail you're missing - the C++ compiler doesn't necessarily have the entire inheritance diagram available at the time it compiles a single class. Since C++ supports one-pass compilation, the compiler would need to generate the best code it can as soon as it sees the class, meaning that it may have to pessimistically assume there will be multiple inheritance or other odd dynamic dispatches going on. Under those conditions, the solution that's worked so far is the one that we are actually using. – templatetypedef Mar 25 '14 at 02:54
@Alice It's more costly from a development standpoint, but once the code is written there's no extra cost. (Mind you, I share your frustrations - I'm just insisting that the reasons for these decisions are well known, and are accurately described in this answer.) – Lilshieste Mar 25 '14 at 02:57
@templatetypedef Not true; you cannot instantiate a class if you do not have its full definition, only a pointer to that class (as a pointer to a class will always be the same size: a holy void*). – Alice Mar 25 '14 at 02:57
@Lilshieste No, it's more costly from an execution stand point. Each and every call through a virtual function now must pay an adjustment to pointer cost, even if they do not actually need to adjust their pointer; this is clearly less performant than only those which need to be adjusted being adjusted. – Alice Mar 25 '14 at 02:58
@Alice Sorry, let me clarify. When the compiler generates code for a class, it has to generate that code independently of its subclasses because the compiler can't tell in advance what subclasses, if any, exist. Therefore, the compiler needs to generate the vtables under pessimal assumptions. This is a separate issue from whether you can use a class without it being declared. Consequently, vtables can get a bit messy because they have to account for the possibility that code that hasn't been generated yet will interact with the code currently being generated. Does that make sense? – templatetypedef Mar 25 '14 at 03:02
@templatetypedef The only reason those vtables get messy is because of the case discussed in my question; if that case did not exist (for example, if my thunk solution was implemented), then vtables would not need to get messy due to pessimal assumptions. This would lead to the general case of vtable dispatch being faster (as there would be no need to adjust the pointers or check to see if they need to be adjusted), while the special case would be largely the same (it would need an extra inlinable call but it would also need no checks, as it knows what it is). – Alice Mar 25 '14 at 03:07
@templatetypedef The messiness ONLY ARISES because of this corner case, which if eliminated, would allow method function pointers to not be "fat". Does this make sense? – Alice Mar 25 '14 at 03:07
@Alice I think I see what you're saying now. It looks like there is a bit of confusion in all this discussion about the precise problem you're trying to solve (e.g., member function pointer syntax in general vs. member function pointers when used in inheritance). I'd be interested in following along future discussions on this, if you start any. – Lilshieste Mar 25 '14 at 03:24
@Lilshieste The problem, and it's not one with a solution per se, is that C++ made a decision with regards to member function pointers, which had broad implications on how other things (such as virtual inheritance) can be implemented. If member function pointers had been stated to be identical to normal function pointers with a special calling convention or a required this argument, all of this messiness would have been solved, say, through my proposed solution. The question is, why did they decide on this mechanism? I don't know; I'm not on the standard board. Thus the question. – Alice Mar 25 '14 at 03:27
