Name mangling

In compiler construction, name mangling (also called name decoration) is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages.

It provides a means of encoding additional information in the name of a function, structure, class, or other data type, so that the compiler can pass more semantic information to the linker.

The need for name mangling arises where a language allows different entities to be named with the same identifier as long as they occupy a different namespace (typically defined by a module, class, or explicit namespace directive) or have different type signatures (such as in function overloading). It is required in these cases because each signature might require a different, specialized calling convention in the machine code.
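The idea of folding namespace and parameter types into a symbol name can be sketched with a toy encoder. The sketch below is loosely modeled on the scheme of the Itanium C++ ABI (used by GCC and Clang), which the text above does not name; that choice is an assumption made for illustration. The function toy_mangle and its lookup table are hypothetical, cover only a few builtin types, and ignore templates, pointers, qualifiers, and substitutions.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy encoder (hypothetical helper, not a real compiler API).
// Builds a symbol from enclosing scopes, a function name, and parameter types.
std::string toy_mangle(const std::vector<std::string>& scopes,
                       const std::string& name,
                       const std::vector<std::string>& params) {
    // One-letter codes for a handful of builtin types.
    static const std::map<std::string, std::string> codes = {
        {"void", "v"}, {"bool", "b"}, {"char", "c"}, {"int", "i"},
        {"long", "l"}, {"float", "f"}, {"double", "d"}};

    // Identifiers are written as <length><text>, e.g. "ns" becomes "2ns".
    auto with_length = [](const std::string& id) {
        return std::to_string(id.size()) + id;
    };

    std::string out = "_Z";
    if (scopes.empty()) {
        out += with_length(name);
    } else {
        out += "N";  // start of a nested (qualified) name
        for (const auto& scope : scopes) out += with_length(scope);
        out += with_length(name);
        out += "E";  // end of the nested name
    }

    if (params.empty()) {
        out += "v";  // an empty parameter list is encoded like (void)
    } else {
        // at() throws std::out_of_range for types this sketch does not know.
        for (const auto& p : params) out += codes.at(p);
    }
    return out;
}
```

Under these assumptions, toy_mangle({}, "f", {"int"}) and toy_mangle({}, "f", {"double"}) yield different symbols for the two overloads, and toy_mangle({"ns"}, "f", {"int"}) yields a third symbol for the function of the same name in namespace ns.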

Any object code produced by compilers is usually linked with other pieces of object code (produced by the same or another compiler) by a type of program called a linker. The linker needs a great deal of information on each program entity. For example, to correctly link a function it needs its name, the number of arguments and their types, and so on.

The simple programming languages of the 1970s, like C, only distinguished subroutines by their name, ignoring other information including parameter and return types. Later languages, like C++, defined stricter requirements for routines to be considered "equal", such as the parameter types, return type, and calling convention of a function. These requirements enable method overloading and detection of some bugs (such as using different definitions of a function when compiling different source code files). These stricter requirements needed to work with extant programming tools and conventions. Thus, added requirements were encoded in the name of the symbol, since that was the only information a traditional linker had about a symbol.
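The contrast between the two linkage models can be seen in a short C++ fragment. This is a minimal sketch: the function names are invented for illustration, and the symbol names in the comments are what compilers following the Itanium C++ ABI (such as GCC and Clang) typically emit, which is an assumption about the toolchain. Other compilers use different schemes, and some platforms add a leading underscore.

```cpp
// C linkage: the symbol is derived from the name alone (typically "square_c").
// A second extern "C" function named square_c with other parameter types
// would be rejected, because the two could not be told apart by name.
extern "C" int square_c(int x) { return x * x; }

// C++ linkage: the parameter types are encoded in the symbol, so both
// overloads can coexist in one object file.
int square(int x) { return x * x; }        // typically _Z6squarei
double square(double x) { return x * x; }  // typically _Z6squared
```

Listing the symbols of the resulting object file with a tool such as nm is one way to inspect the names a given compiler actually produces.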

Another use of name mangling is to detect changes that are not related to the signature, such as function purity, or whether a function can throw an exception or trigger garbage collection. D is an example of a language that does this. The result is a simplified form of error checking. For example, the functions int f(); and int g(int) pure; could be compiled into one object file, and their signatures then changed to float f(); and int g(int); before compiling other source files that call them. At link time, the linker will find no non-pure function g(int) and report an error. Similarly, it will find that the return type of f differs and report an error. Without these checks, incompatible calling conventions would be used, most likely producing a wrong result or crashing the program.

Mangling does not usually capture every detail of the calling process. For example, it does not fully prevent errors caused by changes to the data members of a struct or class. The code struct S {}; void f(S) {} could be compiled into one object file, and the definition of S then changed to struct S { int x; }; before compiling a call to f(S()). In such cases the compiler will usually use a different calling convention, but f mangles to the same name in both cases, so the linker will not detect the problem, and the result will usually be a crash or data or memory corruption at runtime.
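The link-time check described above can be modeled in a few lines. This is a minimal sketch under stated assumptions: Signature, toy_symbol, and unresolved are hypothetical names, the encoding is invented for illustration and is not D's actual mangling scheme, and a std::set of strings stands in for the symbol table of an object file.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical description of a function as seen by the compiler.
struct Signature {
    std::string return_type;
    std::string name;
    std::vector<std::string> param_types;
    bool pure;
};

// Toy encoding (not D's real scheme): name, return type, parameter type
// names, and a purity marker are all folded into the symbol.
std::string toy_symbol(const Signature& s) {
    std::string out = s.name + "$" + s.return_type + "$";
    for (const auto& p : s.param_types) out += p + ",";
    out += s.pure ? "$pure" : "$impure";
    return out;
}

// Stand-in for a linker: returns every referenced symbol that has no
// matching definition among the symbols an object file provides.
std::vector<std::string> unresolved(const std::set<std::string>& defined,
                                    const std::vector<Signature>& referenced) {
    std::vector<std::string> missing;
    for (const auto& ref : referenced) {
        const std::string sym = toy_symbol(ref);
        if (defined.count(sym) == 0) missing.push_back(sym);
    }
    return missing;
}
```

If the defined set holds the symbols for int f() and int g(int) pure, then references to float f() and to a non-pure int g(int) both come back as unresolved, which mirrors the two link errors in the example. A parameter of type S, by contrast, is encoded only by its name, so a change to the members of S leaves the symbol unchanged and goes unnoticed.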

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.