So before we get into the differences between pointers and references, I feel like we need to talk a little bit about declaration syntax, partly to explain why pointer and reference declarations are written the way they are and partly because the way many C++ programmers write pointer and reference declarations misrepresents that syntax (get comfortable, this is going to take a while).
In both C and C++, declarations are composed of a sequence of declaration specifiers followed by a sequence of declarators [1]. In a declaration like
static unsigned long int a[10], *p, f(void);
the declaration specifiers are `static unsigned long int` and the declarators are `a[10]`, `*p`, and `f(void)`.
Array-ness, pointer-ness, function-ness, and in C++ reference-ness are all specified as part of the declarator, not the declaration specifiers. This means when you write something like
int* p;
it’s parsed as
int (*p);
Since the unary `*` operator is a unique token, the compiler doesn't need whitespace to distinguish it from the `int` type specifier or the `p` identifier. You can write it as `int *p;`, `int* p;`, `int * p;`, or even `int*p;`.
It also means that in a declaration like
int* p, q;
only `p` is declared as a pointer - `q` is a regular `int`.
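To make that concrete, here is a small sketch that asks the compiler what it thinks the types are. The helper name `only_p_is_pointer` is made up for illustration; the checks rely only on the standard `<type_traits>` header.

```cpp
#include <type_traits>

// The * belongs to the declarator p, not to the specifier int.
int* p, q;   // p is int*, q is a plain int

static_assert(std::is_same<decltype(p), int*>::value, "p is a pointer to int");
static_assert(std::is_same<decltype(q), int>::value,  "q is a plain int");

// Hypothetical helper: true when p is a pointer and q is not.
constexpr bool only_p_is_pointer() {
    return std::is_pointer<decltype(p)>::value &&
           !std::is_pointer<decltype(q)>::value;
}
```

If the `*` were really part of the type specifier, the second `static_assert` would fail to compile.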
The idea is that the declaration of a variable closely matches its use in the code ("declaration mimics use"). If you have a pointer to `int` named `p` and you want to access the pointed-to value, you use the `*` operator to dereference it:
printf( "%d\n", *p );
The expression `*p` has type `int`, so the declaration of `p` is written
int *p;
This tells us that the variable `p` has type "pointer to `int`" because the combination of `p` and the unary operator `*` gives us an expression of type `int`. Most C programmers will write the pointer declaration as shown above, with the `*` visibly grouped with `p`.
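A rough sketch of "declaration mimics use" for the three basic declarator forms follows. The names `seven` and `mimic_use` are invented for the example, and it assumes a C++14 or later compiler (for `std::remove_reference_t` and relaxed `constexpr`).

```cpp
#include <type_traits>

constexpr int seven() { return 7; }   // declarator seven(): the expression seven() is an int

constexpr int mimic_use() {
    int value = 42;
    int *p = &value;        // declarator *p:   the expression *p   is an int
    int a[3] = {1, 2, 3};   // declarator a[3]: the expression a[i] is an int

    // decltype of an lvalue expression yields int&, so strip the reference
    // before comparing against int.
    static_assert(std::is_same<std::remove_reference_t<decltype(*p)>, int>::value,
                  "*p has type int");
    static_assert(std::is_same<std::remove_reference_t<decltype(a[1])>, int>::value,
                  "a[1] has type int");
    static_assert(std::is_same<decltype(seven()), int>::value,
                  "seven() has type int");

    return *p + a[1] + seven();   // 42 + 2 + 7
}
```

In each case the expression you would write to get at the `int` has the same shape as the declarator.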
Now, Bjarne and the couple of generations of C++ programmers who followed thought it was more important to emphasize the pointer-ness of `p` rather than the `int`-ness of `*p`, so they introduced the
int* p;
convention. However, this convention falls down for anything but a simple pointer (or pointer to pointer). It doesn't work for pointers to arrays:
int (*a)[N];
or pointers to functions
int (*f)(void);
or arrays of pointers to functions
int (*p[N])(void);
etc. Declaring an array of pointers as
int* a[N];
just indicates confused thinking. Since `[]` and `()` are postfix, you cannot associate the array-ness or function-ness with the declaration specifiers by writing
int[N] a;
int(void) f;
like you can with the unary `*` operator, but the unary `*` operator is bound to the declarator in exactly the same way as the `[]` and `()` operators are. [2]
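The more complicated declarators above can be exercised the same way. This is only a sketch: `N` is given the arbitrary value 3, and `one`, `two`, `three`, and `complex_declarators` are hypothetical names used to have something to point at. It assumes C++14 or later.

```cpp
constexpr int N = 3;   // assumed size, chosen only for the example

constexpr int one()   { return 1; }
constexpr int two()   { return 2; }
constexpr int three() { return 3; }

constexpr int complex_declarators() {
    int arr[N] = {10, 20, 30};

    int (*a)[N] = &arr;                      // pointer to array:      (*a)[i]   is an int
    int (*f)(void) = one;                    // pointer to function:   (*f)()    is an int
    int (*p[N])(void) = {one, two, three};   // array of pointers to
                                             // functions:             (*p[i])() is an int

    // Each use below has the same shape as the matching declarator.
    return (*a)[2] + (*f)() + (*p[1])();     // 30 + 1 + 2
}
```

Note that there is no way to write any of these three declarations with the `*` pushed up against `int`; the parentheses force it to sit with the name.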
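C++ references break the rule about "declaration mimics use" hard. In a non-declaration statement, the expression `&x` (using the built-in `&` operator) always yields a pointer type: if `x` has type `int`, `&x` has type `int *`. In a declaration such as `int &r = x;`, on the other hand, `r` is a reference to `int`, and the expression `&r` is still an `int *`, not an `int`. So `&` has a completely different meaning in a declaration than in an expression.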
So that's syntax; now let's talk about pointers vs. references.
A pointer is just an address value (although with additional type information). You can do (some) arithmetic on pointers, you can initialize them to arbitrary values (or `NULL`, spelled `nullptr` in modern C++), and you can apply the `[]` subscript operator to them as though they were an array (indeed, the array subscript operation is defined in terms of pointer operations: `p[i]` means `*(p + i)`). A pointer is not required to be valid (that is, contain the address of an object during that object's lifetime) when it's first created.
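As a quick illustration of those pointer properties, consider the sketch below. The array contents and the name `pointer_demo` are arbitrary choices for the example; it assumes C++14 or later.

```cpp
constexpr int pointer_demo() {
    int data[4] = {10, 20, 30, 40};

    int *p = nullptr;   // a pointer may start out pointing at nothing
    p = data;           // later it is made to point at data[0]
    p = p + 2;          // pointer arithmetic: p now points at data[2]

    // Subscripting a pointer is defined as *(p + i), so these pair up:
    //   p[1]  is *(p + 1), which is data[3]
    //   p[-1] is *(p - 1), which is data[1]
    return p[1] + *(p - 1);   // 40 + 20
}
```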
A reference is another name for an object or function, not just that object's or function's address (this is why you don't use the `*` operator when working with references). You can't do pointer arithmetic on references, you can't assign arbitrary values to a reference, etc. When instantiated, a reference must refer to a valid object or function. How exactly references are represented internally isn't specified.
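For contrast with pointers, here is a small sketch of a reference acting as another name for an object. The variable names and `reference_demo` are invented for the example; it assumes C++14 or later.

```cpp
constexpr int reference_demo() {
    int x = 1;
    int y = 2;

    int &r = x;   // must be initialized; r is now another name for x
    r = y;        // assigns y's value to x; r still refers to x, not to y
    r += 10;      // modifies x through r, with no * needed

    return x;     // x is now 2 + 10
}
```

Writing `int &r;` with no initializer would not compile, which is the reference counterpart of the "must refer to a valid object" rule above.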
1. This is the C terminology - the C++ terminology is a little different.
2. In case it isn't clear by now, I consider the `T* p;` idiom to be poor practice and responsible for no small amount of confusion about pointer declaration syntax; however, since that's how the C++ community has decided to do things, that's how I write my C++ code. I don't like it and it makes me itch, but it's not worth the heartburn to argue over it or to have inconsistently formatted code.