
I tried to implement a small collections library. I do this whenever I learn a new language, because it teaches most of the language's details.

So, I started with a form of "generic" dynamic array. Well it is not really generic, because it just holds pointers to the actual data. But to be honest, I don't fully understand, why I need a double void pointer here.

Here is the Vector struct defined in my header file. (I declared every function and put every #include in the header file, but omitted that here to keep the code readable. I also omitted some bounds checks.)

typedef struct {
    size_t capacity; //the allocated capacity
    size_t length; //the actual length
    void **data; //here I don't fully understand, why I need a double pointer.
} Vector;

Here is my implementation of a few functions. The compiler complains when I use a single void pointer in my struct, i.e. void *data instead of void **data.

#include "utils.h"

const size_t INITIAL_SIZE = 16;

//Creates a new empty vector.
Vector *vec_new(void) {
    printf("sizeof Vector is: %zu", sizeof(Vector)); //%zu is the format specifier for size_t
    Vector *vec = malloc(sizeof(Vector));
    if(vec == NULL) {
        fprintf(stderr, "Error allocating memory.");
        exit(EXIT_FAILURE);
    }
    vec->length = 0;
    vec->capacity = INITIAL_SIZE;
    void *data = calloc(INITIAL_SIZE, sizeof(void*));
    if(data == NULL) {
        free(vec); //vec->data is not set yet, so free the struct itself
        fprintf(stderr, "Error allocating memory.");
        exit(EXIT_FAILURE);
    }
    vec->data = data;
    return vec;
}

//This method appends the specified value at the end of the vector.
void vec_push(Vector *vec, void *data) {
    if(vec->length == vec->capacity-1) {
        vec_resize(vec);
    }
    vec->data[vec->length] = data;
    vec->length += 1;
}

//Gets the value at the specified index or NULL if index is out of bounds.
void *vec_get(Vector *vec, size_t index) {
    if(index >= vec->length) {
        return NULL;
    }
    return vec->data[index];
}

//Resizes the vector to 1.5x its current capacity.
void vec_resize(Vector *vec) {
    vec->capacity *= 1.5;
    void *data = realloc(vec->data, sizeof(void*) * vec->capacity);
    if(data == NULL) {
        free(vec->data);
        fprintf(stderr, "Error allocating memory.");
        exit(EXIT_FAILURE);
    }
    vec->data = data;
}

It seems like this is where the magic happens, which I do not yet understand:

void *data = malloc(...);
vec->data = data;

malloc and calloc return a void pointer, so I can either declare a variable of an actual type or just use the returned void pointer. So the first line is clear. As far as I understand it, vec->data is equivalent to (*vec).data. Under the assumption that I do not use a double pointer in the struct definition, this line should basically assign a void pointer to a void pointer.

Can someone explain to me in simple terms why exactly a single void pointer is not enough here, or where I might be misunderstanding something?

user438383
456c526f
  • At a glance, `void**` seems plain wrong. `vec->data = data;` is also plain wrong and you won't get a compiler error/warning for it. – Lundin Jan 21 '22 at 09:15

2 Answers


If you have a pointer of the type

T *p1;

where T is some type specifier, for example void, then a pointer to this pointer is declared like

T **p2 = &p1;
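
As a minimal sketch of that relationship, here is the same chain with int standing in for T. The function names read_through and demo_double_pointer are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: follows a pointer-to-pointer down to the int it ends at. */
int read_through(int **p2) {
    assert(p2 != NULL && *p2 != NULL);
    return **p2;   /* *p2 is an int *, **p2 is the int itself */
}

/* Hypothetical demo: builds the p1 / p2 chain with T = int. */
int demo_double_pointer(void) {
    int value = 42;
    int *p1 = &value;   /* T *p1 */
    int **p2 = &p1;     /* T **p2 = &p1 */
    return read_through(p2);
}
```

Replacing int with void in both declarations gives exactly the void * / void ** pair from the question, except that a void * cannot be dereferenced at the last step.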

In this call of calloc

calloc(INITIAL_SIZE, sizeof(void*))

you allocate an array of pointers of the type void *. The function returns a pointer to the first element of the allocated array. So to access the elements you need to write

void **data = calloc(INITIAL_SIZE, sizeof(void*));

To make it clearer, assume that you need to dynamically allocate an integer array. In this case you would write

int *data = calloc( INITIAL_SIZE, sizeof( int ) );

So dereferencing the pointer data like *data gives you an object of the type int, more precisely the first element of the allocated array.

When the elements of the array have the type void *, dereferencing the pointer data like *data must yield a pointer of the type void * (the first element of the allocated array). So for the operation to be correct, the pointer data must have the type void **.
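
The following sketch puts the two allocations side by side. The helper names are hypothetical; the point is only the type that *data has in each case.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: *data is an int, the first element of an int array. */
int first_int(int *data) {
    assert(data != NULL);
    return *data;
}

/* Hypothetical helper: *data is a void *, the first element of a pointer array. */
void *first_pointer(void **data) {
    assert(data != NULL);
    return *data;
}

/* Hypothetical demo: returns 1 if both arrays hand back what was stored. */
int demo_element_types(void) {
    static int seven = 7;
    int *ints = calloc(4, sizeof(int));
    void **ptrs = calloc(4, sizeof(void *));
    int ok = 0;
    if (ints != NULL && ptrs != NULL) {
        ints[0] = seven;     /* the slot holds an int */
        ptrs[0] = &seven;    /* the slot holds a pointer to an int */
        ok = first_int(ints) == 7 && first_pointer(ptrs) == (void *)&seven;
    }
    free(ints);
    free(ptrs);
    return ok;
}
```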

Vlad from Moscow

But to be honest, I don't fully understand, why I need a double void pointer here.

Some background first - maybe you already know that:

A pointer of the type someType * is a pointer to some variable of the type someType or to an array of variables of the type someType.

A pointer of the type someType ** is a pointer to a variable of the type someType * - this means: A pointer to a pointer to a variable of the type someType.

A pointer of the type void * is a pointer to anything; because the compiler does not know what kind of element this pointer points to, it is not possible to access such an element directly.

In contrast to this, it is known what variable a pointer of the type void ** points to: It points to a variable of the type void *.
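
A small sketch of that difference, with hypothetical helper names: a void ** can be dereferenced because the result has the known type void *, while a void * has to be converted to a concrete pointer type first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reads the void * stored in a slot. */
void *load_slot(void **slot) {
    assert(slot != NULL);
    return *slot;   /* fine: *slot has the known type void * */
}

/* Hypothetical helper: to read through a void *, the code must say
   what it points to. Here the caller promises that it is an int. */
int load_int(void *p) {
    assert(p != NULL);
    /* return *p;  would not compile: *p has the incomplete type void */
    return *(int *)p;
}
```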

Why you need void** in this position:

The key is these lines:

vec->data[vec->length] = data;
...
return vec->data[index];

In these lines, the code accesses the data that vec->data points to. For this reason, vec->data cannot be void *; it must be xxx *, where xxx is the type of the data that vec->data points to. And because vec->data points to a pointer of the type void *, xxx is void *, so xxx * is void **.
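
As a sketch of why the indexing needs an element type, here is a reduced stand-in for the struct. The names Slots, slots_set and slots_get are hypothetical and not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Vector struct, reduced to the relevant field. */
typedef struct {
    void **data;   /* points to the first slot; each slot holds one void * */
} Slots;

/* data[i] means *(data + i): step over i slots of sizeof(void *) bytes each. */
void slots_set(Slots *s, size_t i, void *item) {
    assert(s != NULL && s->data != NULL);
    s->data[i] = item;
}

void *slots_get(const Slots *s, size_t i) {
    assert(s != NULL && s->data != NULL);
    return s->data[i];
}

/* If the field were declared as void *data, s->data[i] would be rejected,
   because the compiler has no element type and therefore no element size. */
```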

vec->data = data;

Your observation is correct: vec->data is of the type void ** and data is of the type void *.

The reason is that malloc() returns some memory and the compiler does not know which kind of data is stored in this memory. So the value returned by malloc() is void * and not void **.

In the automotive industry, you would use an explicit pointer cast like this:

vec->data = (void **)data;

The expression (xxx *)y tells the compiler that the pointer y points to some data of the type xxx. So (void **) tells the compiler that the pointer points to an element of the type void *.

However, in desktop applications you often don't write the (void **), because C implicitly converts void * to any other object pointer type on assignment.
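
A minimal sketch of the two spellings, using made-up helper names. Both functions allocate the same array of void * slots; they differ only in whether the conversion from void * to void ** is written out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: relies on the implicit void * -> void ** conversion. */
void **alloc_slots_implicit(size_t n) {
    assert(n > 0);
    void **slots = calloc(n, sizeof *slots);
    return slots;   /* NULL if the allocation failed */
}

/* Hypothetical helper: same allocation, with the conversion spelled out. */
void **alloc_slots_explicit(size_t n) {
    assert(n > 0);
    void *raw = calloc(n, sizeof(void *));
    return (void **)raw;
}
```

In C both versions compile and behave the same. In C++ only the version with the cast compiles, because C++ does not convert void * to other pointer types implicitly.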

Martin Rosenau
  • "In the automotive industry, you would use an explicit pointer cast like this" No this is wrong, the use of void pointer conversions is not allowed in automotive systems to begin with. Mainly because type-generic programming doesn't make any sense in deterministic real-time systems. See MISRA-C:2012 rules 11.5 and 11.6. – Lundin Jan 21 '22 at 09:18
  • @Lundin That's right: `void *` pointers are typically avoided completely. However, I was working on low-layer software (e.g. device drivers) where it is required to "justify" some MISRA or "QA/C" warnings because accessing the DMA (as an example) cannot be done in a 100% MISRA-compliant way. In such situations, suppressing a warning (with a justification in a comment) about an explicit cast from `void *` to some other pointer was allowed in the company where I worked while implicit casts were never allowed. – Martin Rosenau Jan 21 '22 at 11:31
  • Often you can use a `uint8_t*` as a generic pointer when dealing with raw data. – Lundin Jan 21 '22 at 11:34
  • Thank you for your useful comments. I'm coming from high-level languages, mostly Java, so I thought it would be a good idea to implement some form of a generic, dynamic array. Your comments made me think about whether this may not be the best idea in C. Especially because I got a job in the automotive industry here in Germany, which is the reason I am working on my C skills, so I'm deeply interested in a good solution. Maybe I have to post the whole code on codereview.stackexchange to get overall feedback. – 456c526f Jan 21 '22 at 11:53
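
To illustrate the uint8_t* suggestion from the comments: unlike a void *, a uint8_t * can be indexed and dereferenced directly, which suits raw byte buffers. The function name sum_bytes and the checksum scenario are assumptions for illustration only, not something from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds up the bytes of a raw buffer, modulo 256. */
uint8_t sum_bytes(const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + buf[i]);   /* buf[i] is a plain byte value */
    }
    return sum;
}
```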