4

Reading this Skip List implementation I came across this code fragment:

typedef struct nodeStructure{
    keyType key;
    valueType value;
    node forward[1]; /* variable sized array of forward pointers */
    };

To me it seems that forward[1] denotes a one element array. And the comment calls it a variable sized array.

Do I misunderstand something, or is this just a mistake in the source I'm reading?

ovgolovin

4 Answers

5

It is called the struct hack. It is the old form of the flexible array member introduced in C99.

This has been used in the past to mimic a variable-sized array in the last member of a structure, but it is not a strictly conforming construct in C.

ouah
3

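This is a programming idiom in C that you will see sometimes. When allocating the structure, you allocate sizeof(struct nodeStructure) + numNodes * sizeof(node) bytes, where numNodes is the number of forward pointers you want beyond the one already declared in the struct.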

This allows you to have multiple forward nodes for the struct, even though it is only declared to have one. It's a bit of an ugly hack, but it works.

Typically, when you do this, there will also be a field called 'count' or something similar, so that you know how many extra entries follow the node.
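A minimal sketch of that allocation pattern, in case it helps. The keyType and valueType typedefs, the count field and the new_node helper are assumptions made for illustration; they are not taken from the skip list source in the question. Note that indexing forward past element 0 is exactly the part that the standard does not guarantee.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the skip list's own typedefs. */
typedef int keyType;
typedef int valueType;
typedef struct nodeStructure *node;

struct nodeStructure {
    keyType key;
    valueType value;
    int count;        /* assumed extra field: number of usable forward slots */
    node forward[1];  /* struct hack: really holds `count` elements */
};

/* Hypothetical helper: allocate a node with room for `levels`
   forward pointers (levels must be at least 1). */
node new_node(keyType key, valueType value, int levels)
{
    /* One pointer already lives inside the struct, so add levels - 1 more. */
    size_t size = sizeof(struct nodeStructure)
                + (size_t)(levels - 1) * sizeof(node);
    node n = malloc(size);
    int i;

    if (n == NULL)
        return NULL;

    n->key = key;
    n->value = value;
    n->count = levels;
    for (i = 0; i < levels; i++)
        n->forward[i] = NULL;  /* indices above 0 rely on the hack */
    return n;
}
```

A call such as new_node(1, 2, 4) would give a node whose forward[0] through forward[3] can be used, and a single free() releases the whole thing.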

samoz
  • I spent a few minutes trying to find count, but couldn't. Eventually it dawned on me that indices above `numNodes` won't be accessed at all, because of how pointers to this node are kept: the node is only reached from skip list levels that it also has in its own array. – ovgolovin Nov 08 '12 at 20:23
2

This is a common trick for older C compilers (before C99): compilers allowed you to access elements past the end of forward's declared length when it is the last member of the struct; you could then malloc enough memory for the additional node elements, like this:

struct nodeStructure *ptr = malloc(sizeof(struct nodeStructure) + 4*sizeof(node));
for (int i = 0 ; i != 5 ; i++) { // One of the five elements is already part of the struct
    ptr->forward[i] = ...
}
free(ptr);

The trick lets you embed arrays of variable size in a structure without a separate dynamic allocation. An alternative solution would be to declare node *forward, but then you'd need to malloc and free it separately from the nodeStructure, unnecessarily doubling the number of mallocs and potentially increasing memory fragmentation.

Here is how the above fragment would look without the hack:

struct nodeStructure {
    keyType key;
    valueType value;
    node *forward; /* points to a separately allocated array */
};

struct nodeStructure *ptr = malloc(sizeof(struct nodeStructure));
ptr->forward = malloc(5*sizeof(node));
for (int i = 0 ; i != 5 ; i++) {
    ptr->forward[i] = ...
}
free(ptr->forward);
free(ptr);

EDIT (in response to comments by Adam Rosenfield): C99 lets you declare an array with no size as the last member of a struct, like this: node forward[]; This is called a flexible array member, and it is defined in section 6.7.2.1, paragraph 16 of the C99 standard.
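For comparison, here is a rough sketch of the same node written with a C99 flexible array member. The keyType and valueType typedefs and the new_node_fam helper are hypothetical names chosen for the example, not part of the original skip list code. Because the flexible member contributes no elements to sizeof, the full element count is added, with no "minus one" adjustment.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the skip list's own typedefs. */
typedef int keyType;
typedef int valueType;
typedef struct nodeStructure *node;

struct nodeStructure {
    keyType key;
    valueType value;
    node forward[];  /* C99 flexible array member: no declared size */
};

/* Hypothetical helper: allocate a node with `levels` forward pointers. */
node new_node_fam(keyType key, valueType value, size_t levels)
{
    /* sizeof excludes the flexible member, so add all `levels` slots. */
    node n = malloc(sizeof(struct nodeStructure) + levels * sizeof(node));

    if (n == NULL)
        return NULL;

    n->key = key;
    n->value = value;
    for (size_t i = 0; i < levels; i++)
        n->forward[i] = NULL;  /* well-defined for i < levels */
    return n;
}
```

As with the older trick, one malloc and one free cover both the fixed fields and the array.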

Sergey Kalinichenko
  • What is the use of such an obscure syntax? Does it give any benefits? – ovgolovin Nov 08 '12 at 20:04
  • @ovgolovin Please see the edit, I added an info on the benefit of this trick (fewer allocations). – Sergey Kalinichenko Nov 08 '12 at 20:07
  • Please, could you write how allocation can be done via `node *forward` (if it's not too long). I think it will be helpful not only for me, but for the other newbies who are learning C. – ovgolovin Nov 08 '12 at 20:09
  • @ovgolovin Sure, no problem - please take a look. – Sergey Kalinichenko Nov 08 '12 at 20:12
  • The C standard lets you do this? I always thought it was a hack that only worked because compilers arranged structs so similarly. Since `forward` is defined as a 1-element array, `forward[10]` is outside the array and thus should technically trigger UB. – cHao Nov 08 '12 at 20:12
  • @cHao Yes, `forward[10]` was and is undefined behavior, but the nineties used to be such simpler times. – Pascal Cuoq Nov 08 '12 at 20:14
  • It's worth noting that C99 added a feature called the *flexible array member* which is very similar, except you declare the array with indeterminate size: `node forward[];` – Adam Rosenfield Nov 08 '12 at 20:21
  • @AdamRosenfield Thanks, Adam, I edited the answer to include the information from your comment. – Sergey Kalinichenko Nov 08 '12 at 20:29
  • Could you also write a code which would use *flexible array member*? Because it seems to me `nodeStructure *ptr = malloc(sizeof(nodeStructure)+4*sizeof(node))` could be replaced with another syntax to declare the array size, or will it still be done through tinkering with `malloc`? – ovgolovin Nov 08 '12 at 20:31
  • @ovgolovin Yes, you use the same trick, except you do not subtract one from the intended array size (i.e. for a five-element array you use `malloc(sizeof(nodeStructure)+5*sizeof(node))`, not `malloc(sizeof(nodeStructure)+4*sizeof(node))`). Prior to flexible arrays it was very confusing (I made this mistake in my first answer, using 5 instead of 4, and corrected it only in my last edit). – Sergey Kalinichenko Nov 08 '12 at 20:37
  • @dasblinkenlight: By my understanding of the C standard, the struct hack relies upon Undefined Behavior, since given a struct member of e.g. `char foo[2]` the maximum amount one is allowed to add to `&foo` would be either 2 (the declared size) or ([end of allocated space] - `foo`), *whichever is less*; such a rule would allow a compiler to e.g. replace `foo[expr]` with `expr ? foo[1] : foo[0]` if the latter operation would be faster. In practice, I suspect compilers would disable any such optimizations with indirectly-accessed structs that end with a single-element array, but... – supercat Dec 05 '12 at 16:09
  • ...I don't know of anything in any standard that would require them to do so. Specifying a size which was larger than one would ever need would seem more legitimate, but some compilers won't allow the declaration of structs beyond a certain size (e.g. 64K) whether or not any "full-sized" instances would ever be created. Some older compilers allow an array size of 0; I wonder why the standard didn't simply follow their lead (a zero-sized array could be useful not only as a flexible-array member, but also to enforce alignment of the next member and alias a pointer to it). – supercat Dec 05 '12 at 16:16
2

The data structure implementation is most likely written against the C90 standard, which did not have flexible array members (added in C99). At that time, it was common to use a 1- or even 0-sized(*) array at the end of a struct to allow access to a dynamically variable number of elements there.

The comment should not be interpreted as meaning C99-style variable length arrays; besides, in C99, the idiomatic and standard-conformant definition for member forward would be node forward[];. The member forward then has an incomplete array type, while struct nodeStructure itself remains a complete type: you can take its size, which is computed as if the flexible member were omitted (apart from possible trailing padding), but the struct may not be used as an array element or as a member of another struct, and a plainly declared variable of this type has no usable forward elements. With node forward[0] or node forward[1] all of these uses are allowed, although they arguably mismatch the programmer's intent.

(*) 0-sized arrays are forbidden by the standard but GCC accepted these as an extension for precisely this use.

Pascal Cuoq