I'll take a shot at summarizing the (very short) mailing-list thread and adding some of my own recollections from my days doing compilers (using EDG's front-end as it was circa 2005–2010):
The Itanium C++ ABI is used by many POSIX-ish platforms. Several implementations exist (e.g. GNU has one, LLVM/Clang has one), and naturally they diverge slightly, both because of bugs and because they might care a little more or less about vendor-specific corner cases.
- For example, Clang mangles the (non-ISO-standard) block type int (^)(int) as U13block_pointerFiiE. Neither Clang's nor GCC's c++filt can round-trip that back into anything resembling int (^)(int). And GCC doesn't support block types at all.
Microsoft uses the Microsoft C++ ABI.
The original Cfront used what you might as well call "the Cfront C++ ABI," documented in §7.2c of the ARM. The default mangling used by Edison Design Group's Cfront-alike frontend is a lineal descendant of the Cfront C++ ABI.
- But EDG also supports the Itanium ABI (at least) as an alternative, configurable at compile-time (at least).
I'm sure there are other mangling schemes in existence, especially if you look at the Cambrian explosion of the 1990s and don't insist that the mangling must have survived all the way to the 2020s...
Here's a concrete example of these three manglings:
namespace Hello { class World; }
struct Class {
    explicit Class(int);
    int func(const double&, Hello::World*) const;
};
Itanium mangling: _ZN5ClassC1Ei, _ZNK5Class4funcERKdPN5Hello5WorldE.
Microsoft mangling: ??0Class@@QEAA@H@Z, ?func@Class@@QEBAHAEBNPEAVWorld@Hello@@@Z.
Cfront mangling: __ct__5ClassFi, func__5ClassFRCdPQ25Hello5World. (The ARM doesn't explain how to distinguish the complete-object constructor from the subobject constructor used for virtual bases; you'd have to look at an actual implementation, like EDG's, to figure that out.)
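If you want to check the Itanium names above yourself, here is a minimal sketch, assuming a GCC or Clang toolchain whose runtime provides abi::__cxa_demangle in <cxxabi.h>. The helper name demangle is my own invention for illustration, and it deliberately gives up (returning its input) on anything the runtime doesn't recognize, such as a Microsoft-style name.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle, provided by Itanium-ABI runtimes (GCC, Clang)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-mangled
// name, or the input unchanged if the runtime cannot demangle it.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle mallocs the result,
    // so the caller is responsible for freeing it.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

Calling demangle("_ZN5ClassC1Ei") should give back Class::Class(int), while the Microsoft and Cfront names come back untouched, since this demangler only understands the Itanium scheme.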
Orthogonally to name-mangling, there are plenty more ABI decisions to make:
Does exception handling use tables (as the Itanium ABI is specified to), or setjmp/longjmp prologue/epilogue code around each function?
Does the vtable contain simple function pointers (pointing to "thunks" when a cast to virtual base needs to be made), or a more complicated schema (eliminating the need for "thunk" code, but complicating each virtual call-site) as used in Cfront and shown in the ARM §10.8c?
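To see why a vtable entry sometimes needs a "this"-adjusting thunk at all, here is a small sketch. It uses an ordinary (non-virtual) second base rather than a virtual base, purely to keep the example short; the struct and function names are made up for illustration. The point is only that a B* into a C object does not hold the same address as the C itself, so something (a thunk, or extra work at the call site) has to adjust the pointer before C::b runs.

```cpp
struct A {
    virtual ~A() = default;
    virtual int a() { return 1; }
    long pad_a = 0;
};

struct B {
    virtual ~B() = default;
    virtual int b() { return 2; }
    long pad_b = 0;
};

// C overrides a function inherited from its second base.
struct C : A, B {
    int b() override { return 3; }
};

// True when the B subobject lives at a nonzero offset inside C.
bool second_base_is_offset() {
    C c;
    B* pb = &c;  // implicit derived-to-base conversion adjusts the address
    return static_cast<void*>(pb) != static_cast<void*>(&c);
}

// A virtual call through B* must end up in C::b with "this" pointing at
// the whole C object; how that adjustment happens is the ABI's business.
int call_through_second_base() {
    C c;
    B* pb = &c;
    return pb->b();
}
```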
What is the size and layout of int (Widget::*)()? A pointer to non-virtual function can easily be the same size as int (*)(), but what if Widget contains virtual functions? The Itanium ABI says int (Widget::*)() should be the size of two pointers. The Microsoft ABI, if I understand correctly, simply uses thunks for this too, and int (Widget::*)() is the size of a single function pointer (which might point to a thunk).
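The size difference is easy to observe. This is a rough sketch, not a statement about every platform: the Widget members and the microsoft_abi flag are hypothetical names of mine, and the flag simply assumes that any compiler not defining _MSC_VER follows the Itanium ABI.

```cpp
#include <cstddef>

struct Widget {
    virtual ~Widget() = default;
    virtual int draw() { return 1; }  // virtual member function
    int id() { return 2; }            // non-virtual member function
};

using FreeFn   = int (*)();
using MemberFn = int (Widget::*)();

constexpr std::size_t free_fn_size   = sizeof(FreeFn);
constexpr std::size_t member_fn_size = sizeof(MemberFn);

#if defined(_MSC_VER)
constexpr bool microsoft_abi = true;
#else
constexpr bool microsoft_abi = false;  // assumption: everything else is Itanium
#endif

// The same MemberFn type can name either kind of function; the call
// dispatches virtually when the pointer refers to a virtual member.
int call(Widget& w, MemberFn f) { return (w.*f)(); }
```

On an Itanium-ABI compiler I'd expect member_fn_size to be twice free_fn_size; on MSVC, for a simple single-inheritance class like this one, the two should match.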
- Weirdly, it is MSVC that is cool with casting from pointer-to-member-of-virtual-base to pointer-to-member-of-derived, and GCC/Clang that reject it. I haven't bothered to figure out why this should be so, or whether MSVC's codegen is actually doing the right thing here. It might have to do with...
What is the struct layout of virtual bases: at the front of the class, at the end of the class, or in between (as shown, although maybe by accident, in the ARM §10.5c)? And where are their offsets stored at runtime? We'll need to know those offsets every time we cast a pointer of unknown dynamic type to one of its virtual bases. The Microsoft ABI uses both a "vfptr" (to the virtual function table) and a "vbptr" (to a table of virtual base offsets), whereas the Itanium ABI uses only a vfptr (which it just calls "the vptr"), and the virtual base offsets are stored in memory preceding the vtable.
struct E { virtual ~E(); };
struct F : virtual E {};
Here Microsoft makes sizeof(F)==16, whereas Itanium and Cfront make it only 8.
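Those numbers assume 8-byte pointers. Here is a sketch that states the same thing in units of sizeof(void*), so it also holds on 32-bit targets. The names size_of_F, to_virtual_base, and microsoft_abi are mine, and the flag again assumes that any compiler not defining _MSC_VER follows the Itanium ABI.

```cpp
#include <cstddef>

struct E { virtual ~E(); };
struct F : virtual E {};

E::~E() = default;  // defined out of line so that objects can be created and linked

constexpr std::size_t size_of_F = sizeof(F);

#if defined(_MSC_VER)
constexpr bool microsoft_abi = true;   // vfptr plus vbptr: two pointers
#else
constexpr bool microsoft_abi = false;  // assumption: Itanium, a single vptr
#endif

// Converting F* to its virtual base is the operation that needs the
// virtual base offset at run time, wherever the ABI chooses to keep it.
E* to_virtual_base(F* f) { return f; }
```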
How do we do inline functions and templates: prelinker? COMDAT sections (and if so, how do we name them)?
And probably many more knobs I haven't thought of.