I will discuss things in the context of your code, but I want to get some basics out of the way first.
In a declaration, the unary * operator indicates that the thing being declared has pointer type:
T *p; // for any type T, p has type "pointer to T"
T *p[N]; // for any type T, p has type "N-element array of pointer to T"
T (*p)[N]; // for any type T, p has type "pointer to N-element array of T"
T *f(); // for any type T, f has type "function returning pointer to T"
T (*f)(); // for any type T, f has type "pointer to function returning T"
The unary * operator has lower precedence than the postfix [] subscript and () function-call operators, so if you want a pointer to an array or a function, the * must be explicitly grouped with the identifier using parentheses.
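To make the grouping rule concrete, here is a small sketch that declares one object of each kind and checks how each is used in an expression. The names twice and declarator_demo are made up for illustration; they are not from your code.

```c
#include <assert.h>

/* Hypothetical helper, only here so the function pointer has a target. */
static int twice(int n) { return 2 * n; }

/* Returns 1 if every check passes; each assert documents one claim. */
int declarator_demo(void)
{
    int a = 1, b = 2, c = 3;
    int row[3] = { 10, 20, 30 };

    int *ptrs[3] = { &a, &b, &c };  /* 3-element array of pointer to int */
    int (*prow)[3] = &row;          /* pointer to 3-element array of int */
    int (*fp)(int) = twice;         /* pointer to function returning int */

    assert(sizeof ptrs == 3 * sizeof(int *)); /* ptrs really is an array   */
    assert(sizeof *prow == 3 * sizeof(int));  /* *prow is the whole array  */
    assert(*ptrs[1] == 2);     /* [] applies first, then * */
    assert((*prow)[1] == 20);  /* parentheses make * apply first */
    assert((*fp)(21) == 42);   /* dereference the pointer, then call */
    return 1;
}
```

The expressions mirror the declarations: *ptrs[1] subscripts and then dereferences, while (*prow)[1] dereferences and then subscripts.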
In an expression, the unary * operator dereferences the pointer, allowing us to access the pointed-to object or function:
int x;
int *p;
p = &x; // assign the address of x to p
*p = 10; // assigns 10 to x via p - int = int
After the above code has executed, the following are true:
p == &x // int * == int *
*p == x // int == int
x == 10 // int == int
The expressions p and &x have type int * (pointer to int), and their value is the (virtual) address of x. The expressions *p and x have type int, and their value is 10.
A valid[1] object pointer value is obtained in one of three ways (function pointers are also a thing, but we won't get into them here):
- using the unary & operator on an lvalue[2] (p = &x;);
- allocating dynamic memory via malloc(), calloc(), or realloc();
- and, what is relevant for your code, using an array expression that is not the operand of the & or sizeof operator.
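The three sources listed above can be sketched side by side. This is only an illustration; the function name pointer_sources_demo and the variable names are assumptions, not anything from your program.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 on success, -1 if the allocation fails. */
int pointer_sources_demo(void)
{
    int x = 5;
    int arr[4] = { 1, 2, 3, 4 };

    int *from_addr = &x;                         /* 1. unary & on an lvalue      */
    int *from_heap = malloc(sizeof *from_heap);  /* 2. dynamic allocation        */
    int *from_arr  = arr;                        /* 3. array expression "decays" */

    if (from_heap == NULL)
        return -1;
    *from_heap = 7;

    assert(*from_addr == 5);
    assert(*from_heap == 7);
    assert(from_arr == &arr[0] && *from_arr == 1);

    free(from_heap);  /* after this, from_heap no longer holds a valid pointer */
    return 0;
}
```

Note that the heap pointer stops being valid once free() is called, because the lifetime of the object it pointed to has ended.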
Except when it is the operand of the sizeof or unary & operator, or is a string literal used to initialize a character array in a declaration, an expression of type "N-element array of T" is converted ("decays") to an expression of type "pointer to T", and the value of the expression is the address of the first element of the array[3]. So, if you create an array like
int a[10];
and pass that array expression as an argument to a function like
foo( a );
then before the function is called, the expression a is converted from type "10-element array of int" to "pointer to int", and the value of a is the address of a[0]. So what the function actually receives is a pointer value, not an array:
void foo( int *a ) { ... }
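One way to see the conversion is to compare sizeof in the caller with sizeof in the callee. The sketch below uses made-up names (size_seen_by_callee, sum, decay_demo) and passes the element count separately, since the callee cannot recover it from the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function: reports the size of its own parameter. */
size_t size_seen_by_callee(int *a)
{
    return sizeof a;  /* size of a pointer, not of the caller's array */
}

/* Only a pointer arrives, so the element count is a separate argument. */
int sum(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

int decay_demo(void)
{
    int a[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    assert(sizeof a == 10 * sizeof(int));            /* no decay under sizeof */
    assert(size_seen_by_callee(a) == sizeof(int *)); /* decayed to int *      */

    return sum(a, sizeof a / sizeof a[0]);  /* count computed in the caller */
}
```

The sizeof a / sizeof a[0] idiom only works where a is still an array, which is why it appears in the caller and not inside sum.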
String literals like "add" and "five to two" are array expressions - "add" has type "4-element array of char" and "five to two" has type "12-element array of char" (an N-character string requires at least N+1 elements to store because of the string terminator).
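The sizes quoted above can be checked directly, since sizeof is one of the contexts where the literal does not decay. This is a small sketch; literal_size_demo, buf, and p are names invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Returns the combined storage size of the two literals. */
int literal_size_demo(void)
{
    /* sizeof counts the terminator, strlen does not */
    assert(sizeof "add" == 4);
    assert(strlen("add") == 3);
    assert(sizeof "five to two" == 12);
    assert(strlen("five to two") == 11);

    char buf[] = "add";     /* literal initializes an array: no decay */
    const char *p = "add";  /* literal decays to a pointer to 'a'     */

    assert(sizeof buf == 4);
    assert(sizeof p == sizeof(char *));
    assert(strcmp(buf, p) == 0);

    return (int)(sizeof "add" + sizeof "five to two");
}
```

buf is a separate, modifiable copy of the characters, whereas p points at the literal itself, which must not be modified.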
In the statements
mnemonic = "add";
operands = "five to two";
neither string literal is the operand of the sizeof or unary & operators, and they're not being used to initialize a character array in a declaration, so both expressions are converted to type char * and their values are the addresses of the first element of each array. Both mnemonic and operands are declared as char *, so this is fine.
Since the types of mnemonic and operands are both char *, when you call
analyse_inst( mnemonic, operands );
the types of the function's formal arguments must also be char *:
void analyse_inst( char *mnemonic, char *operands )
{
...
}
As far as the "pass by reference" bit...
C passes all function arguments by value. That means the formal argument in the function definition is a different object in memory from the actual argument in the function call, and any changes made to the formal argument are not reflected in the actual argument. Suppose we write a swap function as:
#include <stdio.h>

void swap( int a, int b ) // a and b are copies of the arguments
{
  int tmp = a;
  a = b;
  b = tmp;
}
int main( void )
{
  int x = 2;
  int y = 3;
  printf( "before swap: x = %d, y = %d\n", x, y );
  swap( x, y );
  printf( "after swap: x = %d, y = %d\n", x, y );
  ...
}
If you compile and run this code, you'll see that the values of x and y don't change after the call to swap - the changes to a and b had no effect on x and y, because they're different objects in memory.
In order for the swap function to work, we have to pass pointers to x and y:
void swap( int *a, int *b )
{
  int tmp = *a;
  *a = *b;
  *b = tmp;
}
int main( void )
{
  ...
  swap( &x, &y );
  ...
}
In this case, the expressions *a and *b in swap refer to the same objects as the expressions x and y in main, so the changes to *a and *b are reflected in x and y:
a == &x, b == &y
*a == x, *b == y
So, in general:
void foo( T *ptr ) // for any non-array type T
{
  *ptr = new_value(); // write a new value to the object `ptr` points to
}
void bar( void )
{
  T var;
  foo( &var ); // write a new value to var
}
This is also true for pointer types - replace T with a pointer type P *, and we get the following:
void foo( P **ptr ) // for any type P
{
  *ptr = new_value(); // write a new pointer value to the object `ptr` points to
}
void bar( void )
{
  P *var;
  foo( &var ); // write a new pointer value to var
}
In this case, var stores a pointer value. If we want to write a new pointer value to var through foo, then we must still pass a pointer to var as the argument. Since var has type P *, the expression &var has type P **.
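Applied to a string pointer, the same pattern might look like the sketch below. The function names set_mnemonic and set_mnemonic_by_value are hypothetical, and const char * is used here instead of plain char * only because the pointer ends up aimed at a string literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Receives a copy of the caller's pointer; the assignment is lost on return. */
void set_mnemonic_by_value(const char *out)
{
    out = "add";
    (void)out;  /* silence "set but not used" warnings */
}

/* Receives a pointer to the caller's pointer, so *out is the caller's object. */
void set_mnemonic(const char **out)
{
    *out = "add";
}

/* Returns 0 if the caller's pointer ends up pointing at "add". */
int pointer_to_pointer_demo(void)
{
    const char *mnemonic = NULL;

    set_mnemonic_by_value(mnemonic);
    assert(mnemonic == NULL);  /* unchanged: only the copy was modified */

    set_mnemonic(&mnemonic);   /* &mnemonic has type const char **      */
    assert(mnemonic != NULL);

    return strcmp(mnemonic, "add");
}
```

This is the swap situation again with T replaced by const char *: to change the caller's pointer, the function needs the address of that pointer.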
[1] A pointer value is valid if it points to an object within that object's lifetime.
[2] An lvalue is an expression that refers to an object such that the object's value may be read or modified.
[3] Believe it or not, there is a good reason for this rule, but it means that array expressions lose their "array-ness" under most circumstances, leading to much confusion among people first learning the language.