3

I've been learning about linked lists, and the recursive definition of the node struct has been bugging me:

struct node {
    struct node *next;
    int data;
};

I guess I've always imagined that since a pointer is typed, it knows both the beginning address and the amount of memory it can access when it's dereferenced, at the time of declaration. But it can't possibly, since it's declared before an arbitrary amount of other variables which can make the struct of any size. Does it figure it out only when dereferenced, or is there some sort of memory table that gets filled at the end of the struct definition and before the pointer could be used?

Sourav Ghosh
nek28
  • The size of that struct is easy to calculate. It's just 1 pointer and an int. It's all done at compile time. – byxor Jan 18 '17 at 20:45
  • What the pointer points to has nothing to do with the struct itself. Perhaps that is confusing you. – DeiDei Jan 18 '17 at 20:47
  • The struct has a definition, thus it has a (fixed) size. `sizeof (struct node)` is a constant, which is already available at compile time. An object of type `struct node` will have exactly the same type and size. A pointer to such an object, **when dereferenced**, will have the same size, too. – wildplasser Jan 18 '17 at 20:49
  • A pointer is typed, but you can cast it to another type. For example, you can convert it to a generic pointer: `struct node *nodep = &mynode; void *ptr = (void*)nodep;`. – DYZ Jan 18 '17 at 20:49
  • *"But it can't possibly, since it's declared before an arbitrary amount of other variables which can make the struct of any size."* First, it doesn’t have to all be done in one pass; the compiler could leave the size of `struct node` undecided and then go back and change it… if it needed to store that information in the field, which it probably doesn’t. It can just store the type of the field and look the type up when necessary instead. This is kind of hard to explain since it’s really up to the compiler what to do as long as it follows specification. – Ry- Jan 18 '17 at 20:54

4 Answers

8

A pointer is just a single value: it holds a single address in memory.

It is the compiler that knows the size of structures and the offsets of the fields in those structures. Whenever you access a field through a pointer to a structure, the compiler adds that field's offset to the address.

Look at the following program:

struct X {
    char a;
    int  b;
    long c;
};

void y() {
    struct X x;
    x.a = 42;
    x.b = 43;
    x.c = 44;
}

Function y is translated to the following assembler code (gcc -S):

y:  
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movb    $42, -16(%rbp)
    movl    $43, -12(%rbp)
    movq    $44, -8(%rbp)
    nop
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

You can clearly see the values 42, 43, and 44. The compiler calculated the offsets of the fields in the structure x. They are relative to the frame pointer (rbp) because x is allocated on the stack.
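The same offsets can be inspected from C itself. In the listing above the stores land at -16, -12 and -8 from rbp, which corresponds to field offsets 0, 4 and 8 on that particular build. The following is a minimal sketch, not part of the original answer: `print_offsets` and `read_b_by_offset` are hypothetical helper names, and the printed numbers depend on the platform's padding rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct X {
    char a;
    int  b;
    long c;
};

/* Print each field's offset from the start of struct X.
   The exact numbers depend on the platform's ABI and padding rules. */
void print_offsets(void) {
    printf("a at %zu, b at %zu, c at %zu, sizeof(struct X) = %zu\n",
           offsetof(struct X, a), offsetof(struct X, b),
           offsetof(struct X, c), sizeof(struct X));
}

/* Reach field b by hand: start address plus the compile-time offset.
   This is roughly what the compiler does for x->b. */
int read_b_by_offset(const struct X *x) {
    const char *base = (const char *)x;
    const int *pb = (const int *)(base + offsetof(struct X, b));
    return *pb;
}
```

Calling `print_offsets()` from `main` shows the layout your own compiler chose, and `read_b_by_offset(&x)` should return the same value as `x.b`.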

Grzegorz Żur
1

At the point where the structure is declared with a member that is a pointer to its own type, the compiler only needs to know the type of that member, i.e. that it is a pointer to the structure type. The structure type itself may still be incomplete at that point, but the size of a pointer to it is already known for that platform.

In other words, this is the same reason you cannot have a member of the structure's own type inside its declaration (at that point, the structure is not yet complete), but you can have a pointer to that type.

Once you make the pointer member point to some valid memory, it holds the beginning address, too. Then, on dereference, the compiler computes the member's address from that beginning address plus a fixed offset and reads the value from that location, based on the previously mentioned "type".
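To make the "beginning address plus offset" idea concrete, here is a small sketch that walks a list and checks where `data` lives. The helper names `sum_list` and `data_is_base_plus_offset` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next;
    int data;
};

/* Walk the list: every p->next and p->data access is
   "the address stored in p, plus a fixed offset". */
int sum_list(const struct node *head) {
    int total = 0;
    for (const struct node *p = head; p != NULL; p = p->next) {
        total += p->data;
    }
    return total;
}

/* Returns 1 if &p->data is exactly p's address plus offsetof(struct node, data). */
int data_is_base_plus_offset(const struct node *p) {
    const char *base = (const char *)p;
    return (const char *)&p->data == base + offsetof(struct node, data);
}
```

The pointer member only stores where the next node starts; the offsets come from the struct definition, which the compiler has fully seen by the time any `p->data` is compiled.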

Sourav Ghosh
1

Edit

Derp. Now I understand the question you're actually asking.

There are two parts to this. First, the compiler allows you to create pointers to "incomplete" types, where the size isn't yet known. Secondly, all pointers to struct types have the same size and representation, regardless of the size of the actual struct type.

Going by your example:

struct node {
    struct node *next;
    int data;
};

When the compiler sees the declaration for next, the type struct node is incomplete - the compiler doesn't yet know how big struct node will be. However, at this point in the process, it doesn't need to know that size in order for you to declare a pointer to that type. You're not yet at a point where the compiler needs to know sizeof *next.

The type definition is complete when the compiler sees the closing }; of the struct definition - at that point, the compiler knows how big the struct node type actually is.
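A short sketch of what is and is not allowed while the type is still incomplete. The names `ptr_size_before`, `pass_through`, `ptr_size_early` and `node_size` are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node;                       /* incomplete: size not known yet */

/* Legal even now: the size of a pointer to struct node is already known. */
static const size_t ptr_size_before = sizeof(struct node *);

/* A function can take and return the pointer without the full definition. */
struct node *pass_through(struct node *p) { return p; }

struct node {                      /* this definition completes the type */
    struct node *next;
    int data;
};

/* Only from here on are sizeof(struct node) and p->data allowed. */
size_t node_size(void) { return sizeof(struct node); }

size_t ptr_size_early(void) { return ptr_size_before; }
```

Moving `node_size` above the struct definition should make the compiler reject `sizeof(struct node)` as applied to an incomplete type, while everything that only touches the pointer keeps compiling.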

Original

The compiler knows the size of the pointed-to type, so given a pointer p, the expression p + 1 will yield the address of the next object of the pointed-to type.

Given

int    *ip = (int *)0x1000;    // assuming a 4-byte int
char   *cp = (char *)0x1000;   // 1 byte
double *dp = (double *)0x1000; // assuming an 8-byte double

the expression ip + 1 will yield the address of the next 4-byte int object, or 0x1004, cp + 1 will yield the address of the next 1-byte char object, or 0x1001, and dp + 1 will yield the address of the next 8-byte double object, or 0x1008.

The pointer itself points to a single object, period. It has no way of knowing whether the object it's pointing to is part of a sequence, or how large any such sequence would be.
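The step size of `p + 1` can be measured without inventing addresses by pointing into real arrays. This is a sketch with made-up function names (`int_step`, `char_step`, `double_step`); it makes no assumption about how large `int` or `double` are on your platform.

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance between p and p + 1, measured through char pointers. */
size_t int_step(void) {
    int arr[2] = { 0, 0 };
    int *ip = arr;
    return (size_t)((char *)(ip + 1) - (char *)ip);
}

size_t char_step(void) {
    char arr[2] = { 0, 0 };
    char *cp = arr;
    return (size_t)((cp + 1) - cp);
}

size_t double_step(void) {
    double arr[2] = { 0.0, 0.0 };
    double *dp = arr;
    return (size_t)((char *)(dp + 1) - (char *)dp);
}
```

Each function returns `sizeof` of the pointed-to type, which is exactly the scaling the compiler applies to pointer arithmetic.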

John Bode
0

Every pointer has the same size, which depends on your system (4 or 8 bytes, as far as I have seen).

So when you write a struct like this

struct node {
   struct node *next; // 4 or 8 bytes 
   int data; // 4 or 8 bytes
};                           

the compiler knows its exact size.

But try it this way instead and see for yourself.

//Wrong declaration
struct node {
   struct node next; // The compiler cannot decide structure's size
   int data; // 4 or 8 bytes
}; 

This will give a compile error.
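Rather than assuming 4 or 8 bytes, the sizes can be queried with `sizeof`. The sketch below uses hypothetical helpers `node_padding` and `report_sizes`; the struct may also contain padding, so its size is not necessarily the sum of its members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct node {
    struct node *next;
    int data;
};

/* Bytes in struct node that belong to no member (padding), if any. */
size_t node_padding(void) {
    return sizeof(struct node) - sizeof(struct node *) - sizeof(int);
}

/* Report the sizes this particular compiler and target chose. */
void report_sizes(void) {
    printf("pointer: %zu, int: %zu, struct node: %zu, padding: %zu\n",
           sizeof(struct node *), sizeof(int),
           sizeof(struct node), node_padding());
}
```

Calling `report_sizes()` on different targets is a quick way to see that these numbers are a property of the platform, not of the language.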

  • While it's true that pointers to different types typically do have the same sizes and representations on modern desktop and server platforms, that's not guaranteed; the language allows pointers to different types to have different sizes and representations. – John Bode Jan 18 '17 at 21:56
  • The 4 or 8 bytes for int and for pointers is an optimistic bet at best. It's neither guaranteed (take a 16-bit microprocessor architecture and enjoy) nor does it correlate (e.g. on Win64 with Visual Studio, int is 4 bytes while a pointer is 8 bytes). I didn't know about the possibility of different pointer sizes, but that's just so C... these guys wrote a language standard for compilers, not for programmers. – grek40 Jan 18 '17 at 22:27
  • @John Bode, do you mean that void/integer/char/other pointers can have different sizes under the same compiler, or that a single pointer's size can change at runtime, for example from a 4-byte integer pointer to an 8-byte pointer? – korkutserkan Jan 18 '17 at 22:28
  • @korkutserkan: 6.2.5/28: "A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements." – John Bode Jan 18 '17 at 22:34