How to reclaim struct correctly?

Question

I'm trying to understand what is the common idiom (good practice) to provide creation/reclamation functions of a struct. Here is what I tried:

struct test_struct_t{
    int a;
};

struct test_struct_t *create(int a){
    struct test_struct_t *test_struct_ptr = malloc(sizeof(*test_struct_ptr));
    test_struct_ptr -> a = a;
    return test_struct_ptr;
}

void release(struct test_struct_t *test_struct_ptr){
    free((void *) test_struct_ptr);
}

int main(int argc, char const *argv[])
{
    const struct test_struct_t *test_struct_ptr = create(10);
    release(test_struct_ptr); // <--- Warning here
}

I got the warning

passing argument 1 of ‘release’ discards ‘const’ qualifier from pointer 
   target type [-Wdiscarded-qualifiers]

which is clear. So I tend to define reclamation method as follows:

void release(const struct test_struct_t *test_struct_ptr){
    free((void *) test_struct_ptr);
}

The warning disappeared, but I'm not sure it is not error-prone.

So is a common practice to define struct reclamation method parameter as a pointer to a const struct so we can avoid casting to non-const any time and do this dirty cast once in the reclamation method implementation?

Typically you only make those cleanup functions when the struct has alloc'd pointer members that need to be freed first and then the pointer to the struct can be freed. It should always be the opposite of allocation order. Lastly, the dangling pointer should be set to NULL. — Rafael, Jan 18 '19 at 09:07
@Rafael But how about hiding deallocation details within the cleanup function? So we don't have to replace `free` with the custom dealloc function throughout the whole codebase in case we complicate it... — Some Name, Jan 18 '19 at 09:11
Seems to me like a bad idea at first to have only const pointers for dynamically allocated objects. — iBug, Jan 18 '19 at 09:11
@iBug Why? The reason I choose so was that pointer to non-const can be converted to pointer to const without explicit cast (This is my assumption and might be wrong, did not find any proof in the Standard). — Some Name, Jan 18 '19 at 09:13
@SomeName Yes. Your assumption is correct, but since you need to modify it later (`free(3)`), you should have at least a non-const pointer to it, to avoid warnings like this. — iBug, Jan 18 '19 at 09:16
One common practice is to use "opaque pointers" and not expose the struct to the caller at all. Giving the caller a pointer to incomplete type. In which case you don't need the `const` qualifier, since the caller can't do anything meaningful with the pointer anyhow. — Lundin, Jan 18 '19 at 09:18

score 3 · Accepted Answer · answered Jan 18 '19 at 10:55

So is a common practice to define struct reclamation method parameter as a pointer to a const struct so we can avoid casting to non-const any time and do this dirty cast once in the reclamation method implementation?

No. It is more common to not use const with dynamically allocated structures, or with structures containing pointers to dynamically allocated memory.

You only mark const things you do not intend to modify; and freeing it or data referred to by its members is a modification. Just look at how free() is declared: void free(void *), not void free(const void *).

That is the core problem in OP's code, and using struct test_struct_t *test_struct_ptr = create(10); without the const qualifier is the proper solution.

There is an interesting underlying question here, though, that I want to chew a bit on, because the wording of this question is such that those looking for answers to it will encounter this question via a web search.

How to reclaim struct correctly?

Let's look at a real-world case: a dynamically allocated string buffer. There are two basic approaches:

typedef struct {
    size_t          size;  /* Number of chars allocated for data */
    size_t          used;  /* Number of chars in data */
    unsigned char  *data;
} sbuffer1;
#define  SBUFFER1_INITIALIZER  { 0, 0, NULL }

typedef struct {
    size_t          size;  /* Number of chars allocated for data */
    size_t          used;  /* Number of chars in data */
    unsigned char   data[];
} sbuffer2;

One can declare and initialize the first version using the preprocessor initializer macro:

    sbuffer1  my1 = SBUFFER1_INITIALIZER;

This is used in e.g. POSIX.1 pthread_mutex_t mutexes and pthread_cond_t condition variables.

However, because the second one has a flexible array member, it cannot be declared statically; you can only declare pointers to it. So, you need a constructor function:

sbuffer2 *sbuffer2_init(const size_t  initial_size)
{
    sbuffer2  *sb;

    sb = malloc(sizeof (sbuffer2) + initial_size);
    if (!sb)
        return NULL; /* Out of memory */

    sb->size = initial_size;
    sb->used = 0;
    return sb;
}

which you use thus:

    sbuffer2 *my2 = sbuffer2_init(0);

although I personally implement the related functions so that you can do

    sbuffer2 *my2 = NULL;

as the equivalent of sbuffer1 my1 = SBUFFER1_INITIALIZER;.

A function that can grow or shrink the amount of memory allocated for the data, only needs a pointer to the first structure; but either a pointer to the pointer to the second structure, or return the possibly modified pointer, for the changes to be visible to the caller.

For example, if we wanted to set the buffer contents from some source, perhaps

int  sbuffer1_set(sbuffer1 *sb, const char *const source, const size_t length);

int  sbuffer2_set(sbuffer2 **sb, const char *const source, const size_t length);

Functions that only access the data but do not modify it also differ:

int  sbuffer1_copy(sbuffer1 *dst, const sbuffer1 *src);

int  sbuffer2_copy(sbuffer2 **dst, const sbuffer2 *src);

Note that the const sbuffer2 *src is not a typo. Because the function will not modify the src pointer (we could make it const sbuffer2 *const src!), it does not need a pointer to the pointer to the data, just the pointer to the data.

The really interesting part is the reclaim/free functions.

The functions to free such dynamically allocated memory do differ in one important part: the first version can trivially poison the fields to help detect use-after-free bugs:

void sbuffer1_free(sbuffer1 *sb)
{
    free(sb->data);
    sb->size = 0;
    sb->used = 0;
    sb->data = NULL;
}

The second one is tricky. If we follow the above logic, we would write a poisoning reclaim/free function as

void sbuffer2_free1(sbuffer2 **sb)
{
    free(*sb);
    *sb = NULL;
}

but because programmers are used to the void *v = malloc(10); free(v); pattern (as opposed to free(&v);!), they usually expect the function to be

void sbuffer2_free2(sbuffer2 *sb)
{
    free(sb);
}

instead; and this one cannot poison the pointer. Unless the user does the equivalent of sbuffer2_free2(sb); sb = NULL;, there is a risk of reusing the contents of sb afterwards.

The C libraries do not usually return the memory immediately to the OS, but just add it to its own internal free list, to be used by a subsequent malloc(), calloc(), or realloc() call. This means that in most situations the pointer can still be dereferenced after a free() without a runtime error, but the data it points to will be something completely different. This is what makes these bugs so nasty to reproduce and debug.

Poisoning is simply setting the structure members to invalid values, so that the use-after-free is easily detected at run time, because of the easily-seen values. Setting the pointer used to access dynamically allocated memory to NULL means that if the pointer is dereferenced, the program should crash with a segmentation fault. This is much easier to debug with a debugger; at least you can find easily exactly where and how the crash occurred.

This is not so important in self-contained code, but for library code, or for code used by other programmers, it can make a difference in the general quality of the combined code. It depends; I always judge it on a case-by-case basis, although I do tend to use the pointer-member-and-poisoning version for examples.

I've waxed and waned much more about pointer members versus flexible array members in this answer. It might be interesting to those wondering about how to reclaim/free structures, and how to choose which type (pointer member or flexible array member) to use in various cases.

I'm curious about your example `void sbuffer1_free`. The thing is you first `free`ed the `sb -> data` making `sb -> data` pointer dangling. Would it be better to first read the pointer to a local variable then set it to `NULL` and only then `free`. I mean `unsigned char *tmp = sb -> data; sb -> data = NULL; free(tmp);`. This way we never make `sb` points to an object containing a dangling pointer. — Some Name, Jan 19 '19 at 07:01
@SomeName: No, the temporary variable does not make any difference, really. The only case where it would matter is if another thread accesses the buffer while it is being freed. (In all other cases, the freeing, with or without the temporary variable, will act as if an atomic operation.) It does not help with the other thread case at all, because the other thread is almost certainly using a copy of the pointer anyway. (And the race condition itself is really uninteresting, because the result, no matter who wins, is unusable.) — Nominal Animal, Jan 19 '19 at 07:26
@SomeName: I can see why it would *seem* better to use a temporary variable to do so. It is important to realize that if there is a race condition between the accesses here, one order will always lead to garbage or runtime error. Therefore, minimizing the window is not important: you simply must avoid such races altogether, to ensure correct operation. — Nominal Animal, Jan 19 '19 at 07:30
Since struct reclamation is a modification (as you mentioned in the answer) and we want to allocate a struct on a heap and later release it we should not use `const`. By contrast if we want to preserve struct immutable somewhere we can define a function accepting `const struct *` and do some necessary actions where immutability would matter. Is it a common use-case? — Some Name, Jan 20 '19 at 09:46
@SomeName: It is extremely common for functions to define a parameter as `const struct foo *` when the function will not attempt to modify its contents, even when the structure itself is assumed to be mutable in the caller. An excellent example of this is the [`nanosleep()`](http://man7.org/linux/man-pages/man2/nanosleep.2.html) POSIX.1 function. (The POSIX.1 [spec](http://pubs.opengroup.org/onlinepubs/9699919799/functions/nanosleep.html) explicitly mention the two pointers can point to the same object as well, so this is a case where aliasing is explicitly allowed as well.) — Nominal Animal, Jan 20 '19 at 13:41
@SomeName: While compilers today are smart enough to infer `const`ness themselves, it is useful for us human programmers as well. It helps us understand how the function is supposed to operate. In the `nanosleep()` case, the first parameter, `const struct timespec *req` points to the requested duration of the sleep, and the `struct timespec *rem` is either NULL or points to the duration unslept, if interrupted. Typical usecase is then `struct timespec t; t.tv_sec = 5; t.tv_nsec = 500000000; nanosleep(&t, NULL);` or ... — Nominal Animal, Jan 20 '19 at 13:45
`struct timespec treq, trem; treq.tv_sec = 5; treq.tv_nsec = 500000000; if (nanosleep(&treq, &trem) == -1 && errno == EINTR) { /* sleep interrupted, time unslept is ((double)trem.tv_sec + (double)trem.tv_nsec / 1000000000.0) seconds */ } else { /* Slept 5.5 seconds or tiny bit longer */ }`. In other words, the `const` marking does not affect the calling code at all, but it does help us understand the internal operation better, compared to, say, having just a `struct timespec *req` and an `int save_unslept` flag as parameters. — Nominal Animal, Jan 20 '19 at 13:50
I have one more question... Is it common to cast against `const` in the reclamation method in case of "opaque struct". I mean when declaring "opaque struct" and provide defenition to it in the corresponding `C`-file. In header I have opaque `struct test_t; void release(struct test_t *);` in C `struct test_t{ int a; const char *str; }; void release(struct test_t *test_ptr){ free((char *)(test_ptr -> str)); free(test_ptr);}`. I casted `const char *` to `char*`. I think it is safe to do so because we complitely hide the details of `struct test_t` in the C file... — Some Name, Jan 21 '19 at 08:28
@SomeName: It is safe, yes. But, why would you mark it `const` if it is opaque, and only used in your own (i.e. library internal) C code? — Nominal Animal, Jan 21 '19 at 11:45

How to reclaim struct correctly?

1 Answers1