Pointers to pointers vs. normal pointers

Question

The purpose of a pointer is to save the address of a specific variable. Then the memory structure of the following code should look like:

int a = 5;
int *b = &a;

...... memory address ...... value
a ... 0x000002 ................... 5
b ... 0x000010 ................... 0x000002

Okay, fine. Then assume that now I want to save the address of pointer *b. Then we generally define a double pointer, **c, as

int a = 5;
int *b = &a;
int **c = &b;

Then the memory structure looks like:

...... memory address ...... value
a ... 0x000002 ................... 5
b ... 0x000010 ................... 0x000002
c ... 0x000020 ................... 0x000010

So **c refers to the address of *b.

Now my question is, why does this type of code,

int a = 5;
int *b = &a;
int *c = &b;

generate a warning?

If the purpose of pointer is just to save the memory address, I think there should be no hierarchy if the address we are going to save refers to a variable, a pointer, a double pointer, etc., so the below type of code should be valid.

int a = 5;
int *b = &a;
int *c = &b;
int *d = &c;
int *e = &d;
int *f = &e;

Beside so many good answers, may I please post a simple comment. Clang compiler issues this unambiguous warning when trying to compile the questioned part of your code: `warning: incompatible pointer types initializing 'int *' with an expression of type 'int **'; remove & [-Wincompatible-pointer-types]`. This might have made everything clear. — user3078414, Jun 28 '16 at 13:34
Beginners often get confused because they consider "addresses" as a data type per se. They are not. Addresses of data of type X are. And they are different for different types. This led you to believe int * and int ** to be the same. — Michel Billaud, Jun 28 '16 at 15:00
The thought of pointers storing numeric values that are used with machine language loads and stores is an implementation detail, not an aspect of the C language. In fact, the standard makes a point of making very few guarantees about what a pointer 'actually is'. e.g. in one clause it uses the phrase `An object ... has constant address` but in a footnote it clarifies `The term ‘‘constant address’’ means that two pointers to the object constructed at possibly different times will compare equal.` — , Jun 28 '16 at 15:38
"*if the purpose of pointer is just to save the memory address*", it is not. The purpose of a pointer is to save the "memory address" of an object along with its type. Just start dereferencing the pointers and you'll see. — Margaret Bloom, Jun 28 '16 at 15:42
Sorry, not to be rude, but just wondering, what makes _this_ question so useful? This is there in any moderate C book, Pointers chapter second or third article, at max, not to mention, discussed many times in SO. Did I miss something obvious? — Sourav Ghosh, Jun 28 '16 at 19:21
An "int pointer" is an object that holds the address of an integer. A "double pointer" is an object that holds the address of a double. A "pointer to a pointer" is an object that holds the address of a pointer, but is not well specified. — William Pursell, Jun 28 '16 at 20:45
What do you know about using a pointer to point into an array? — Eric Towers, Jun 29 '16 at 03:11
Pointer types matter. It may not be very visible when most of the memory access you do is on a flat memory address space, but that isn't always true, and has only become the norm on x86 in the 32-bit era forwards. 16-bit applications, segmented mode applications and some embedded CPUs may have many different physical representations of a pointer depending on many different factors (allocation type, pointer type, compiler optimizations...). Enjoy the simple life you have today, but don't forget the conditions in which C was designed and used. — Luaan, Jun 29 '16 at 13:49
See also debate about rating of C programmers: [3 star programmer](http://c2.com/cgi/wiki?ThreeStarProgrammer) at the "original" wiki - C2. — Peter M. - stands for Monica, Jun 29 '16 at 13:58
@Luaan CPUs and memory don't really matter. I'd argue that having this type hierarchy simply helps writing correct code. If we collapsed it (assuming that this could be done) you would not have the type system to help you knowing that a certain functions needs a pointer-to-pointer instead of a simple pointer and compile-time warnings/errors for these situations. So just type safety for me is a good argument, without even considering all other possible reasons behind this. — Bakuriu, Jun 29 '16 at 21:46
@SouravGhosh the guy probably boosted his question for rep by linking it from some well populated beginner forum — M.M, Jun 30 '16 at 01:15
@Bakuriu Oh, definitely - but I don't think that was an important consideration in C. Look at the gigantic swathes of undefined behaviour C has - it definitely isn't designed for avoiding silly mistakes, it depends on you knowing exactly what you're doing and not making mistakes. It really is a slightly high-level assembly (especially compared to how assembly looks nowadays :D) on an abstract machine that was meant to be universally implementable. That part actually worked out rather well - if you avoid all UB, C code is very portable (though you usually need some platform-specific code). — Luaan, Jun 30 '16 at 09:56

score 91 · Accepted Answer · edited Jun 29 '16 at 16:10

In

int a = 5;
int *b = &a;   
int *c = &b;

You get a warning because &b is of type int **, and you try to initialize a variable of type int * with it. There is no implicit conversion between those two types, which leads to the warning.

Take the longer example you want to work: if we dereference f, the compiler gives us an int, not a pointer that we can dereference further.

Also note that on many systems int and int* are not the same size (e.g. a pointer may be 64 bits long and an int 32 bits long). If you dereference f and get an int, you lose half the value, and then you can't even cast it to a valid pointer.

edited Jun 29 '16 at 16:10

Toby Speight

answered Jun 28 '16 at 13:04

Some programmer dude

And on systems that C was designed for (as well as some modern embedded systems), there were different kinds of pointers in the same program - for example, near and far pointers, or data and code pointers. *Pointer is not an int value, people; stop pretending it is just because it "mostly works"*. – Luaan Jun 29 '16 at 13:35
@Luaan: C was designed for the early DEC minicomputers, and was most common on the PDP-11, from the 1970s. Those machines did NOT have "near" and "far" pointers. "Near" and "far" were a Microsoft extension to C, to support the brain-dead Intel 8086/8088 segmented architecture on the IBM PC. (At the time that the PC project started, IBM was producing and selling a Motorola 68000-based laboratory computer at another site. Imagine the savings in aspirin alone, from segment headaches avoided, if the two sites had talked...) – John R. Strohm Jun 29 '16 at 14:50
@Luaan not actually true. It wasn't until other vendors and ANSI/ISO got their hands on it that that kind of portability concern was addressed. 1972 C was perfectly sanguine about interchanging pointers and ints; both were 16-bit and pointers were flat. `void *` didn't even exist, you could always just use `char *` :) – hobbs Jun 29 '16 at 16:13

score 53 · Answer 2 · answered Jun 28 '16 at 13:30

If the purpose of pointer is just to save the memory address, I think there should be no hierarchy if the address we are going to save refers variable, pointer, double pointer, ... etc

At runtime, yes, a pointer just holds an address. But at compile time there is also a type associated with every variable. As the others have said, int* and int** are two different, incompatible types.

There is one type, void*, that does what you want: it stores only an address, and you can assign the address of any object to it:

int a = 5;
int *b = &a;
void *c = &b;

But when you want to dereference a void*, you need to supply the 'missing' type information yourself:

int a2 = **((int**)c);

answered Jun 28 '16 at 13:30

alain

If I understand correctly, the type of a pointer tells how many bytes the CPU reads starting from the referenced memory address. So the expression `int *b = &a; printf("%d", *b);` means that starting from the address of a, we read 4 bytes. This is possible because we defined the size of int as 4 bytes, and we print that integer. But then what is the size of int*, int**, int***? Does that differ from system to system? Then what (compiler, CPU, or other?) defines its size? – user42298 Jun 29 '16 at 11:06
The size of the type is only one consideration. There are others. – Gregory Currie Jun 29 '16 at 11:54
Consider a pointer to something p. Now imagine you want to store the value that p points to. int i = *p; The compiler needs to know what the type of the pointer is. In memory, a double looks a lot different to an int. This has nothing to do with the size of the pointer, but how it should treat the data that is pointed to. – Gregory Currie Jun 29 '16 at 12:00
@user42298 The size of pointers typically depends on the CPU and the compiler. A program compiled for 64 bits, running on a 64 bit CPU, has 64 bit wide pointers. But you could also compile for 32 bits and run it on a 64 bit CPU, then the pointers are 32 bit, and the operating system takes care of running that correctly. Btw an `int` can have different sizes too, but `int32_t` is guaranteed to have exactly 32 bits. – alain Jun 29 '16 at 12:52
This is the most comprehensive answer amongst the others. – edmz Jun 30 '16 at 13:34

score 23 · Answer 3 · edited Jun 29 '16 at 19:44

Now my question is, why does this type of code,
int a = 5; 
int *b = &a; 
int *c = &b; 
generate a warning?

You need to go back to the fundamentals.

variables have types
variables hold values
a pointer is a value
a pointer refers to a variable
if p is a pointer value then *p is a variable
if v is a variable then &v is a pointer

And now we can find all the mistakes in your posting.

Then assume that now I want to save the address of pointer *b

No. *b is a variable of type int. It is not a pointer. b is a variable whose value is a pointer. *b is a variable whose value is an integer.

**c refers to the address of *b.

NO NO NO. Absolutely not. You have to understand this correctly if you are going to understand pointers.

*b is a variable; it is an alias for the variable a. The address of variable a is the value of variable b. **c does not refer to the address of a. Rather, it is a variable that is an alias for variable a. (And so is *b.)

The correct statement is: the value of variable c is the address of b. Or, equivalently: the value of c is a pointer that refers to b.

How do we know this? Go back to the fundamentals. You said that c = &b. So what is the value of c? A pointer. To what? b.

Make sure you fully understand the fundamental rules.

Now that you hopefully understand the correct relationship between variables and pointers, you should be able to answer your question about why your code gives an error.

I think every time OP says *b or **c in the post, they're really trying to just say b and c. You cover that a lot, but don't really cover the last part of the question ("why can't int* point to another int*"), which is what OP is really trying to ask IMO. — mbrig, Jun 29 '16 at 20:06
Whose genius idea was it to make the dereference operator and the 'is a pointer' qualifier the same symbol, anyways... — mbrig, Jun 29 '16 at 20:10
@mbrig: It was Dennis Ritchie's genius idea. If it is not clear *why* this is a genius idea, you're not mentally parsing the language properly. When we say `int * b;` we are not *merely* saying "`b` is a variable of type `int*`. **We are also saying that `*b` is a variable of type `int`**. The genius idea here is that you can mentally consider it as both `int* b` and `int *b`, and either interpretation is correct. — Eric Lippert, Jun 29 '16 at 20:37
yeah, I looked at some linked questions, and that actually made it make sense to me, for the first time ever. I still somewhat question it though. — mbrig, Jun 29 '16 at 21:24

2501 · Answer 4 · 2016-06-28T13:25:47.000

The type system of C requires this, if you want to get a correct warning and if you want the code to compile at all. With only one level of depth of pointers you wouldn't know if the pointer is pointing to a pointer or to an actual integer.

If you dereference a type int** you know the type you get is int* and similarly if you dereference int* the type is int. With your proposal the type would be ambiguous.

Taking from your example, it is impossible to know whether c points to an int or an int*:

c = rand() % 2 == 0 ? &a : &b;

What type is c pointing to? The compiler doesn't know that, so this next line is impossible to perform:

*c;

In C all type information is lost after compiling, as every type is checked at compile-time and isn't needed anymore. Your proposal would actually waste memory and time as every pointer would have to have additional runtime information about the types contained in pointers.

score 17 · Answer 5 · answered Jun 28 '16 at 14:41

Pointers are abstractions of memory addresses with additional type semantics, and in a language like C type matters.

First of all, there's no guarantee that int * and int ** have the same size or representation (on modern desktop architectures they do, but you can't rely on it being universally true).

Secondly, the type matters for pointer arithmetic. Given a pointer p of type T *, the expression p + 1 yields the address of the next object of type T. So, assume the following declarations:

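char  *cp = (char *)  0x1000;  // casts needed: an integer does not convert to a pointer implicitly
short *sp = (short *) 0x1000;  // assume 16-bit short
int   *ip = (int *)   0x1000;  // assume 32-bit int
long  *lp = (long *)  0x1000;  // assume 64-bit long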

The expression cp + 1 gives us the address of the next char object, which would be 0x1001. The expression sp + 1 gives us the address of the next short object, which would be 0x1002. ip + 1 gives us 0x1004, and lp + 1 gives us 0x1008.

So, given

int a = 5;
int *b = &a;
int **c = &b;

b + 1 gives us the address of the next int, and c + 1 gives us the address of the next pointer to int.

Pointers to pointers are required if you want a function to write to a parameter of pointer type. Take the following code:

void foo( T *p )    
{
  *p = new_value(); // write new value to whatever p points to
}

void bar( void )
{
  T val;
  foo( &val );     // update contents of val
}

This is true for any type T. If we replace T with a pointer type P *, the code becomes

void foo( P **p )    
{
  *p = new_value(); // write new value to whatever p points to
}

void bar( void )
{
  P *val;
  foo( &val );     // update contents of val
}

The semantics are exactly the same, it's just the types that are different; the formal parameter p is always one more level of indirection than the variable val.

Support Ukraine · Answer 6 · 2016-06-28T14:02:12.640

I think there should be no hierarchy if the address we are going to save refers variable, pointer, double pointer

Without the "hierarchy" it would be very easy to generate UB all over without any warnings - that would be horrible.

Consider this:

char c = 'a';
char* pc = &c;
char** ppc = &pc;
printf("%c\n", **ppc);   // compiles ok and is valid
printf("%c\n", **pc);    // error: invalid type argument of unary ‘*’

The compiler gives me an error and thereby it helps me to know that I have done something wrong and I can correct the bug.

But without "hierarchy", like:

char c = 'a';
char* pc = &c;
char* ppc = &pc;
printf("%c\n", **ppc);   // compiles ok and is valid
printf("%c\n", **pc);    // compiles ok but is invalid

The compiler can't give any error as there is no "hierarchy".

But when the line:

printf("%c\n", **pc);

executes, it is UB (undefined behavior).

First *pc reads the char as if it was a pointer, i.e. probably reads 4 or 8 bytes even though we only reserved 1 byte. That is UB.

If the program didn't crash due to the UB above but just returned some garbage value, the second step would be to dereference the garbage value. Once again UB.

Conclusion

The type system helps us to detect bugs by treating int*, int**, int***, etc. as different types.

glglgl · Answer 7 · 2016-06-28T13:14:37.677

If the purpose of pointer is just to save the memory address, I think there should be no hierarchy if the address we are going to save refers variable, pointer, double pointer, ... etc. so below type of code should be valid.

I think here is your misunderstanding: the purpose of the pointer itself is to store the memory address, but a pointer usually also has a type, so that we know what to expect at the place it points to.

In particular, unlike you, other people really do want this kind of hierarchy, so as to know what to do with the memory contents the pointer points to.

It is the very point of C's pointer system to have type information attached to it.

If you do

int a = 5;

&a implies that what you get is an int *, so that if you dereference it, you get an int again.

Bringing that to the next levels,

int *b = &a;
int **c = &b;

&b is a pointer as well. But without knowing what hides behind it, that is, what it points to, it is useless. It is important to know that dereferencing a pointer yields the pointed-to type, so that *(&b) is an int *, and **(&b) is the original int value we work with.

If you feel that in your circumstances there should be no hierarchy of types, you can always work with void *, although the direct usability is quite limited.

Jean-Baptiste Yunès · Answer 8 · 2016-06-28T15:40:38.947

If the purpose of pointer is just to save the memory address, I think there should be no hierarchy if the address we are going to save refers variable, pointer, double pointer, ... etc. so below type of code should be valid.

Well, that's true for the machine (after all, roughly everything is a number). But in many languages variables are typed, which means that the compiler can ensure that you use them correctly (types impose a correct context on variables).

It is true that a pointer to pointer and a pointer (probably) use the same amount of memory to store their value (beware: this is not true for int and pointer to int; the size of an address is not related to the size of a house).

So if you have an address of an address, you should use it as such and not as a simple address, because if you access the pointer to pointer as a simple pointer, you would be able to manipulate an address of an int as if it were an int, which it is not (replace int with anything else and you should see the danger). You may be confused because all of these are numbers, but in everyday life you are not: I personally see a big difference between $1 and 1 dog. Dog and $ are types; you know what you can do with them.

You can program in assembly and do what you want, but you will see how dangerous it is, because you can do almost anything, especially weird things. Yes, modifying an address value is dangerous. Suppose you have an autonomous car that should deliver something at an address expressed as a distance, 1200 Memory Street, and suppose the houses on that street are 100 ft apart (so 1221 is not a valid address). If you could manipulate addresses as plain integers, you could try to deliver at 1223 and leave the packet in the middle of the pavement.

Another example could be, house, address of the house, entry number in an address book of that address. All of these three are different concepts, different types...

It's not necessarily true for the machine either. In older systems (and some modern embedded systems), you had different kinds of pointers - for example, the x86 architecture has near (16-bit) and far (32-bit) pointers. C hides (abstracts away) this fact from you, but it's crucial for the application to run on an x86 computer in 16-bit mode. There's other examples too, for example in segmented mode (segment + offset, where a null pointer isn't zero in actual machine code). C doesn't have a lot of high level abstractions, but it has many low level abstractions - that was the whole purpose of C. — Luaan, Jun 29 '16 at 13:43
@Luaan: There are many machines where pointers to functions and pointers to data are different; machines where the target type of a pointer affects its size exist but are much rarer. It would be useful if C defined an optional "pointer to any kind of pointer to data" type which would be defined on implementations where all kinds of data used the same representation, since at present the only way to write a function that can work with all such pointers (e.g. for purposes of sorting) is to manipulate them using memcpy, memmove, or character types, and using any of those techniques... — supercat, Jun 29 '16 at 15:12
...would require a compiler to treat the double-indirect pointers as potentially aliasing everything of every type (including characters, integers, and floating-point values) rather than just pointers, but the authors of the C Standard have traditionally been loath to define anything which couldn't be handled on all implementations. — supercat, Jun 29 '16 at 15:13
All of this is perfectly right, I know; it is sometimes necessary to simplify things. I meant that on raw machines things are less strict than in languages, and that we can't think in raw-machine terms when working in a language. — Jean-Baptiste Yunès, Jun 29 '16 at 17:49

score 9 · Answer 9 · answered Jun 28 '16 at 19:17

There are different types. And there is a good reason for it:

Having …

int a = 5;
int *b = &a;
int **c = &b;

… the expression …

*b * 5

… is valid, while the expression …

*c * 5

makes no sense.

What matters is not how pointers or pointers to pointers are stored, but what they refer to.

score 9 · Answer 10 · answered Jun 29 '16 at 17:30

The C language is statically typed. This means that every pointer has a type, which tells the compiler how to interpret the value at the address it holds.

In your example:

int a = 5;
int *b = &a;

The type of a is int, and the type of b is int * (read as "pointer to int"). Using your example, the memory would contain:

..... memory address ...... value ........ type
a ... 0x00000002 .......... 5 ............ int
b ... 0x00000010 .......... 0x00000002 ... int*

The type is not actually stored in memory, it's just that the compiler knows that, when you read a you'll find an int, and when you read b you'll find the address of a place where you can find an int.

In your second example:

int a = 5;
int *b = &a;
int **c = &b;

The type of c is int **, read as "pointer to pointer to int". It means that, for the compiler:

c is a pointer;
when you read c, you get the address of another pointer;
when you read that other pointer, you get the address of an int.

That is,

c is a pointer (int **);
*c is also a pointer (int *);
**c is an int.

And the memory would contain:

..... memory address ...... value ........ type
a ... 0x00000002 .......... 5 ............ int
b ... 0x00000010 .......... 0x00000002 ... int*
c ... 0x00000020 .......... 0x00000010 ... int**

Since the "type" is not stored together with the value, and a pointer can point to any memory address, the way the compiler knows the type of the value at an address is basically by taking the pointer's type, and removing the rightmost *.

By the way, that's for a common 32-bit architecture. For most 64-bit architectures, you'll have:

..... memory address .............. value ................ type
a ... 0x0000000000000002 .......... 5 .................... int
b ... 0x0000000000000010 .......... 0x0000000000000002 ... int*
c ... 0x0000000000000020 .......... 0x0000000000000010 ... int**

Addresses are now 8 bytes each, while an int is still only 4 bytes. Since the compiler knows the type of each variable, it can easily deal with this difference, and read 8 bytes for a pointer and 4 bytes for the int.

score 6 · Answer 11 · edited Jun 30 '16 at 01:33

Why does this type of code generate a warning?
int a = 5;
int *b = &a;   
int *c = &b;

The & operator yields a pointer to the object; that is, &a is of type int *, so assigning it (through initialization) to b, which is also of type int *, is valid. &b yields a pointer to object b; that is, &b is of type pointer to int *, i.e., int **.

C says in the constraints of the assignment operator (which hold for the initialization) that (C11, 6.5.16.1p1): "both operands are pointers to qualified or unqualified versions of compatible types". But in the C definition of what is a compatible type int ** and int * are not compatible types.

So there is a constraint violation in the int *c = &b; initialization, which means the compiler is required to issue a diagnostic.

One rationale for the rule is that the Standard does not guarantee that two different pointer types have the same size (except for void * and the character pointer types); that is, sizeof (int *) and sizeof (int **) can be different values.

score 4 · Answer 12 · answered Jun 29 '16 at 20:33

That would be because any pointer T* is actually of type pointer to a T (or address of a T), where T is the pointed-to type. In this case, * can be read as pointer to a(n), and T is the pointed-to type.

int     x; // Holds an integer.
           // Is type "int".
           // Not a pointer; T is nonexistent.
int   *px; // Holds the address of an integer.
           // Is type "pointer to an int".
           // T is: int
int **pxx; // Holds the address of a pointer to an integer.
           // Is type "pointer to a pointer to an int".
           // T is: int*

This is used for dereferencing purposes, where the dereference operator will take a T*, and return a value whose type is T. The return type can be seen as truncating the leftmost "pointer to a(n)", and being whatever's left over.

  *x; // Invalid: x isn't a pointer.
      // Even if a compiler allows it, this is a bad idea.
 *px; // Valid: px is "pointer to int".
      // Return type is: int
      // Truncates leftmost "pointer to" part, and returns an "int".
*pxx; // Valid: pxx is "pointer to pointer to int".
      // Return type is: int*
      // Truncates leftmost "pointer to" part, and returns a "pointer to int".

Note how for each of the above operations, the dereference operator's return type matches the original T* declaration's T type.

This greatly aids both primitive compilers and programmers in parsing a pointer's type: For a compiler, the address-of operator adds a * to the type, the dereference operator removes a * from the type, and any mismatch is an error. For a programmer, the number of *s is a direct indication of how many levels of indirection you're dealing with (int* always points to int, float** always points to float* which in turn always points to float, etc.).

Now, taking this into consideration, there are two major issues with only using a single * regardless of the number of levels of indirection:

The pointer is much more difficult for the compiler to dereference, because it has to refer back to the most recent assignment to determine the level of indirection, and determine the return type appropriately.
The pointer is more difficult for the programmer to understand, because it's easy to lose track of how many layers of indirection there are.

In both cases, the only way to determine the value's actual type would be to backtrack it, forcing you to look somewhere else to find it.

void f(int* pi);

int main() {
    int x;
    int *px = &x;
    int *ppx = &px;
    int *pppx = &ppx;

    f(pppx);
}

// Ten million lines later...

void f(int* pi) {
    int i = *pi; // Well, we're boned.
    // To see what's wrong, see main().
}

This... is a very dangerous problem, and one that is easily solved by having the number of *s directly represent the level of indirection.
