More mangling
In Chapter 1 the concept of name mangling was introduced. (Sometimes the more gentle term decoration is used.) In the code
void f();
class X { void f(); };
the function f( ) inside the scope of class X does not clash with the global version of f( ). The compiler performs this scoping by manufacturing different internal names for the global version of f( ) and X::f( ). In Chapter 1 it was suggested that the names are simply the class name “mangled” together with the function name, so the internal names the compiler uses might be _f and _X_f. It turns out that function name mangling involves more than the class name.
Here’s why. Suppose you want to overload two function names
void print(char);
void print(float);
It doesn’t matter whether they are both inside a class or at the global scope. The compiler can’t generate unique internal identifiers if it uses only the scope of the function names. You’d end up with _print in both cases. The idea of an overloaded function is that you use the same function name, but different argument lists. Thus, for overloading to work the compiler must mangle the names of the argument types with the function name. The above functions, defined at global scope, produce internal names that might look something like _print_char and _print_float.
It’s worth noting there is no standard for the way names must be mangled by the compiler, so you will see very different results from one compiler to another. (You can see what it looks like by telling the compiler to generate assembly-language output.) This, of course, causes problems if you want to buy compiled libraries for a particular compiler and linker, but those problems can also exist because of the way different compilers generate code.
That’s really all there is to function overloading: You can use the same function name for different functions, as long as the argument lists are different. The compiler mangles the name, the scope, and the argument lists to produce internal names for it and the linker to use.
Overloading on return values
It’s common to wonder “why just scopes and argument lists? Why not return values?” It seems at first that it would make sense to also mangle the return value with the internal function name. Then you could overload on return values, as well:
void f();
int f();
This works fine when the compiler can unequivocally determine the meaning from the context, as in int x = f( );. However, in C you’ve always been able to call a function and ignore the return value. How can the compiler distinguish which call is meant in this case? Possibly worse is the difficulty the reader has in knowing which function call is meant. Overloading solely on return value is a bit too subtle, and thus isn’t allowed in C++.
Type-safe linkage
There is an added benefit to all this name mangling. A particularly sticky problem in C occurs when the user misdeclares a function, or, worse, a function is called without declaring it first, and the compiler infers the function declaration from the way it is called. Sometimes this function declaration is correct, but when it isn’t, it can be a very difficult bug to find.
Because all functions must be declared before they are used in C++, the opportunity for this problem to pop up is greatly diminished. The compiler refuses to declare a function automatically for you, so it’s likely you will include the appropriate header file. However, if for some reason you still manage to misdeclare a function, either by declaring it yourself by hand or by including the wrong header file (perhaps one that is out of date), the name-mangling provides a safety net that is often referred to as type-safe linkage.
Consider the following scenario. In one file is the definition for a function:
//: C07:Def.cpp {O}
// Function definition
void f(int) {}
///:~
In the second file, the function is misdeclared and then called:
//: C07:Use.cpp
//{L} Def
// Function misdeclaration
void f(char);
int main() {
//! f(1); // Causes a linker error
} ///:~
Even though you can see that the function is actually f(int), the compiler doesn’t know this because it was told, through an explicit declaration, that the function is f(char). Thus, the compilation is successful. In C, the linker would also be successful, but not in C++. Because the compiler mangles the names, the definition becomes something like f_int, whereas the use of the function is f_char. When the linker tries to resolve the reference to f_char, it can find only f_int, and it gives you an error message. This is type-safe linkage. Although the problem doesn’t occur all that often, when it does it can be incredibly difficult to find, especially in a large project. This is one of the cases where you can find a difficult error in a C program simply by running it through the C++ compiler.