First, remember that x and y are variables that exist independently of whatever they point to. The initial values of x and y are indeterminate - they could be 0x00000000, they could be 0xdeadbeef, or they could be a bit pattern that doesn't correspond to a valid address value at all.
The space for the x and y variables has to be taken from somewhere, and since memory isn't infinite, memory locations get reused; some memory locations get reused a lot. In most implementations memory doesn't automatically get erased [1] when you're done with it, so when you create a new object, it will contain the bit pattern of whatever was last written to those bytes [2].
C has a concept of a lifetime for objects, which is the portion of your program's execution during which storage is guaranteed to be reserved for that object. A pointer is valid if it stores the address of an object during that object's lifetime. Valid pointer values are obtained in one of two ways:
- using the & operator on an object during that object's lifetime;
- calling malloc, calloc, or realloc to dynamically allocate space for an object, as you do for x [3].
For example:
#include <stdio.h>

void foo( void )
{
    int *ptr;                            // ptr is initially indeterminate and invalid
    for ( int i = 0; i < 10; i++ )
    {
        ptr = &i;                        // i's lifetime spans the whole for statement,
        printf( "%d = %d\n", *ptr, i );  // so ptr is valid anywhere within the loop
    }
    // ptr still stores the address of i, but i's lifetime has ended,
    // so ptr is *no longer valid* - attempting to read or write through
    // it now leads to undefined behavior
}
After i's lifetime has ended, the space that was reserved for it can be used by something else. If we try to read or write to it through ptr after the loop has finished, the result may not be what we expect. The behavior of doing this is undefined, meaning the compiler and runtime environment aren't required to handle the situation in any particular way. It may work as we expect, we may corrupt data somewhere, we may cause a runtime error, or anything else can happen.
Similarly, executing
*y = 13;
in your program has undefined behavior, because y does not store the address of an object in your program during that object's lifetime. Literally anything can happen at this point - your program can appear to work as intended, corrupt data elsewhere in your program, branch off into a random function, or fail with a runtime error. And the result can be different each time you run it.
Edit
Addressing a question in the comments:
Are you referring to pointers here? Can pointers be considered as object? or is it just the ints and chars that are to be called as object?
Yes, the pointer variables x and y are objects (in the C sense that they're regions of memory that can store values). To better illustrate this, I wrote the following:
#include <stdio.h>
#include <stdlib.h>
#include "dumper.h"

int main( void )
{
    int *x;
    int *y;
    int a;

    char *names[] = { "a", "x", "y", "*x", "*y" };
    void *addrs[] = { &a, &x, &y, NULL, NULL };
    size_t sizes[] = { sizeof a, sizeof x, sizeof y, sizeof *x, sizeof *y };

    puts( "Initial states of a, x, and y:" );
    dumper( names, addrs, sizes, 3, stdout );

    x = calloc( 1, sizeof *x ); // makes sure *x is initialized to 0
    if ( x )
    {
        addrs[3] = x;
        puts( "States of a, x, and y after allocating memory for x" );
        dumper( names, addrs, sizes, 4, stdout );

        *x = 0x11223344;
        puts( "States of a, x, y, and *x after assigning *x" );
        dumper( names, addrs, sizes, 4, stdout );
    }

    y = &a;
    addrs[4] = y;
    puts( "States of a, x, y, *x, and *y after assigning &a to y" );
    dumper( names, addrs, sizes, 5, stdout );

    *y = 0x55667788;
    puts( "States of a, x, y, *x, and *y after assigning to *y" );
    dumper( names, addrs, sizes, 5, stdout );

    free( x );
    return 0;
}
dumper is a little utility I wrote to dump the address and contents of the objects to a specified output stream.
After building and running the code, I get this output for the initial states of my variables:
Initial states of a, x, and y:
Item         Address   00   01   02   03
----         -------   --   --   --   --
   a  0x7ffee3bc59f4   2c   b3   0c   1b    ,...
   x  0x7ffee3bc5a00   01   00   00   00    ....
      0x7ffee3bc5a04   00   00   00   00    ....
   y  0x7ffee3bc59f8   80   5b   bc   e3    .[..
      0x7ffee3bc59fc   fe   7f   00   00    ....
The variable a lives at address 0x7ffee3bc59f4 and takes up 4 bytes - its initial contents for this run are 0x1b0cb32c (x86 is little-endian, so bytes are ordered from least-significant to most-significant). Since a isn't explicitly initialized, its initial contents are indeterminate - each time I run this program the initial value of a will likely be different (as will its address - as a defense against malware, most OSes randomize locations from run to run).
The variable x lives at address 0x7ffee3bc5a00 and takes up 8 bytes, which the dump shows as two rows of 4 bytes each (an object's address is the address of its first, lowest-addressed byte). Similarly, the variable y lives at address 0x7ffee3bc59f8 and also takes 8 bytes. Like a, the initial contents of x and y are indeterminate and will vary from run to run.
After allocating space for an int object that x will point to, I have this:
States of a, x, and y after allocating memory for x
Item         Address   00   01   02   03
----         -------   --   --   --   --
   a  0x7ffee3bc59f4   2c   b3   0c   1b    ,...
   x  0x7ffee3bc5a00   a0   25   50   1e    .%P.
      0x7ffee3bc5a04   c2   7f   00   00    ....
   y  0x7ffee3bc59f8   80   5b   bc   e3    .[..
      0x7ffee3bc59fc   fe   7f   00   00    ....
  *x  0x7fc21e5025a0   00   00   00   00    ....
The variable x now stores the value 0x7fc21e5025a0, which is the address of a block of memory large enough to store an int value. Since I used calloc to allocate the memory, its initial contents are all-bits-0. I can now assign a new int value to that object through the expression *x, which gives me:
States of a, x, y, and *x after assigning *x
Item         Address   00   01   02   03
----         -------   --   --   --   --
   a  0x7ffee3bc59f4   2c   b3   0c   1b    ,...
   x  0x7ffee3bc5a00   a0   25   50   1e    .%P.
      0x7ffee3bc5a04   c2   7f   00   00    ....
   y  0x7ffee3bc59f8   80   5b   bc   e3    .[..
      0x7ffee3bc59fc   fe   7f   00   00    ....
  *x  0x7fc21e5025a0   44   33   22   11    D3".
So I've updated the int object that x points to (i.e., stores the address of).
Finally, I set y to point to a, giving me:
States of a, x, y, *x, and *y after assigning &a to y
Item         Address   00   01   02   03
----         -------   --   --   --   --
   a  0x7ffee3bc59f4   2c   b3   0c   1b    ,...
   x  0x7ffee3bc5a00   a0   25   50   1e    .%P.
      0x7ffee3bc5a04   c2   7f   00   00    ....
   y  0x7ffee3bc59f8   f4   59   bc   e3    .Y..
      0x7ffee3bc59fc   fe   7f   00   00    ....
  *x  0x7fc21e5025a0   44   33   22   11    D3".
  *y  0x7ffee3bc59f4   2c   b3   0c   1b    ,...
The value stored in the variable y is the address of the variable a: 0x7ffee3bc59f4. As you can see, the expression *y designates the same object as a, so it shows the same address and the same value. I can now change the value of a by writing to *y, which leaves us with:
States of a, x, y, *x, and *y after assigning to *y
Item         Address   00   01   02   03
----         -------   --   --   --   --
   a  0x7ffee3bc59f4   88   77   66   55    .wfU
   x  0x7ffee3bc5a00   a0   25   50   1e    .%P.
      0x7ffee3bc5a04   c2   7f   00   00    ....
   y  0x7ffee3bc59f8   f4   59   bc   e3    .Y..
      0x7ffee3bc59fc   fe   7f   00   00    ....
  *x  0x7fc21e5025a0   44   33   22   11    D3".
  *y  0x7ffee3bc59f4   88   77   66   55    .wfU
There's nothing magic about pointer variables - they're just chunks of memory that store a certain type of value (an address). Different pointer types may have different sizes and/or representations (e.g., an int * variable may look different from a char * variable, which may look different from a struct foo * variable). The only rules are:
- char * and void * have the same size and alignment;
- Pointers to qualified types have the same size and alignment as pointers to their unqualified equivalents (e.g., const int * and int * have the same size and alignment);
- All struct pointer types have the same size and alignment (e.g., struct foo * and struct bar * look the same);
- All union pointer types have the same size and alignment.
Operations on pointer values are special, and the syntax for them can be confusing. But pointers are just another data type, and pointer variables are just another kind of object.
1. That is, set to all-bits-0 or some other well-defined "not a value" bit pattern.
2. We're not going to get into the distinction between virtual and physical memory here.
3. You're not allocating space for x itself - you're allocating space for an int object that x will point to.