As a non-expert in C/C++, I always considered arrays declared with square brackets and pointers to be equivalent.

i.e.:

char *my_array_star;
char my_array_square[];

But I noticed that when used in a structure/class they don't behave the same:

typedef struct {
   char whatever;
   char *my_array_star;
} my_struct_star;

typedef struct {
   char whatever;
   char my_array_square[];
} my_struct_square;

The line below displays 16: whatever takes 1 byte and my_array_star takes 8 bytes. Because of padding, the total structure size is 16.

printf("my_struct_star: %zu\n", sizeof(my_struct_star));

The line below displays 1: whatever takes 1 byte and my_array_square isn't taken into account.

printf("my_struct_square: %zu\n", sizeof(my_struct_square));

By playing around, I noticed that the square-bracket member acts as extra space at the end of the structure:

my_struct_square  *i=malloc(2);

i->whatever='A';
i->my_array_square[0]='B';

The line below displays A:

printf("i[0]=%c\n",((char*)i)[0]);

The line below displays B:

printf("i[1]=%c\n",((char*)i)[1]);

So I can no longer say that square brackets are equal to pointers. But I'd like to understand the reason for this behavior. I'm afraid I'm missing a key concept of these languages.

ROMANIA_engineer
Raphael

2 Answers

Arrays and pointers don't behave the same because they're not the same at all; it just seems that way.

Arrays are a group of contiguous items while a pointer is ... well ... a pointer to a single item.

That single item being pointed to may well be the first in an array so that you can access the others as well, but the pointer itself neither knows nor cares about that.

The reason that arrays and pointers often seem to be identical is that, in many cases, an array will decay to a pointer to the first element of that array.

One of the places this happens is in function calls. When you pass an array to a function, it decays into a pointer. That's why things like the size of an array don't pass through to the function explicitly. By that I mean:

#include <stdio.h>

static void fn (char plugh[]) {
    // plugh is really a char*, so this gives the pointer size (4 for me).
    printf ("size = %zu\n", sizeof(plugh));
}

int main (void) {
    char xyzzy[10];
    printf ("size = %zu\n", sizeof(xyzzy)); // will give 10.
    fn (xyzzy);

    return 0;
}

The other thing you'll find is that, while you can plugh++ and plugh-- to your heart's content (as long as you don't dereference outside of the array), you can't do that with the array xyzzy.

In your two structures, there's a major difference. In the pointer version, you have a fixed size pointer inside the structure, which will point to an item outside of the structure.

That's why it takes up space - your 8-byte pointer is aligned to an 8-byte boundary as follows:

+----------------+
| 1 char variable|
+----------------+
| 7 char padding |
+----------------+
| 8 char pointer |
+----------------+

With the "unbounded" array, you have it inside the structure and you can make it as big as you want - you just have to allocate enough memory when you create the variable. By default (i.e., according to sizeof), the array contributes zero bytes:

+----------------+
| 1 char variable|
+----------------+
| 0 char array   |
+----------------+

But you can allocate more space, for example:

typedef struct {
   char whatever;
   char my_array_square[];
} my_struct_square;

my_struct_square *twisty = malloc (sizeof (my_struct_square) + 10);

gives you a pointer twisty to a structure which has a whatever character and an array of ten characters called my_array_square.

These unbounded arrays can only appear at the end of a structure and there can be only one (otherwise the compiler would have no idea where each variable-length section began and ended). They exist specifically to allow arbitrarily sized arrays at the end of structures.

paxdiablo
  • Thank you for this top quality answer, congrats! – Raphael Feb 17 '12 at 09:05
  • +1 excellent explanation. I'm not sure if unsized member arrays are even allowed in the standards. I usually get a warning about using a non-standard extension if I do this. – Kevin Nov 23 '13 at 14:26
  • "Arrays decay to simple pointers when passed to a function": that's the reason I always prefer using pointer syntax in function arguments `void fn (char *foobar)` over array syntax `void fn (char foobar[])` – Flimm Jul 29 '14 at 15:33

The my_array_square member is what is called a "flexible" array member. Such arrays without a specified size can only appear at the end of a struct, and they don't contribute to its size. The intent is to manually allocate the rest of the space for as many elements as you need. For any other array member, the size is fixed at compile time.

The usage pattern of such a struct would be as follows:

my_struct_square *s = malloc(sizeof(my_struct_square) + 5 * sizeof(char));
...
s->my_array_square[4]; // the last element of the array

In all other cases, the size of an array must be known at compile-time. Even the type of an array goes together with its size, i.e., int a[20] is of type int[20], not just int[].

Also, understanding the difference between arrays and pointers is crucial. @paxdiablo has covered that quite well.

Blagovest Buyukliev
  • Note that flexible array members are only allowed in C language version C99 or later. If code like this is attempted in C90 ("ANSI C") then I believe you can end up with crashes when writing data to the possible padding bytes that may be stored at the end of the struct, if the data written is a "trap representation". Please correct me if I'm wrong. – Lundin Feb 17 '12 at 10:08
  • @Lundin: I guess it shouldn't compile at all with a C90 compiler. – Blagovest Buyukliev Feb 17 '12 at 10:47
  • Minor nitpick: Not in all other cases must the array size be known at compile time since C99 introduced variable length arrays. – Daniel Fischer Feb 17 '12 at 15:12