Deallocating a data structure with a void* value

Question

I'm a beginner in C programming, and I'm sure this question is done and solved, but I cannot find much of an answer looking around. In my program, I have this implementation of a BST (Binary Search Tree) node:

typedef struct bstnode{
    char *key;
    void *value;
    struct bstnode *left;
    struct bstnode *right;
}bstnode;

In which I need value to be void* to utilize this same structure for different purposes. My fear, however, is that in doing so I am allowing for more possibilities of error and insability in my program. For example: in many cases, key points to a value that lives in the stack. Which is, well, fine. However, sometimes I need to perform dynamic allocation for key. And sometimes, value is also dynamically allocated. Which means that for every tree I create, I have to perform different deallocation routines (and, if the value is another dynamically allocated struct, I also need to make sure to deallocate that struct properly). Of course, I can keep track of which structure is which, but I fear that this might get out of hand and lead to memory leaks, and also to a less "mantainable" codebase (ex. having to make specific deallocation functions for specific use cases). My question is: is what I am doing bad practice? Is there a better way to go about it than having to write functions for every use case of the structure?

Could you include information in these "value"s so they can self-identify? — Scott Hunter, May 31 '23 at 12:50
I mean, I could replace value with a struct that has members char *type and void *value and have a function that deallocs things based on the type. But I'm also concerned about having to differentiate heap and stack values. — randomdood1923, May 31 '23 at 12:55
The rule of thumb is that whoever allocated the memory is also in charge of deleting it. So in case your struct doesn't allocate memory for this pointer, then it shouldn't be freeing it either. It's really that simple. — Lundin, May 31 '23 at 12:55
`than having to write functions for every use case of the structure` do you know function pointers? You can pass a deallocation function pointer when deallocating the tree, or you can store a free function pointer alongside the value, or you can store a free function pointer in the tree root. — KamilCuk, May 31 '23 at 12:56
@Lundin I see. What about passing heap pointers to threads? Sometimes I don't want to keep the parent thread around or to keep track of every pointer that another thread has gotten hold of, and I was relying on the child thread deallocating the resource. Should I just keep the parent thread around and deallocate the memory when the thread is done from the parent thread by polling for it to be done? Like passing a pointer to a flag and checking if it's done every once in a while (I need the parent thread to perform other tasks) — randomdood1923, May 31 '23 at 13:06
@randomdood1923 The code creating the thread, allocating the parameters and handling the thread callback should all be the same module. Then it is the same code handling allocation/deallocation, even if freeing is handled by the thread callback before exiting. — Lundin, May 31 '23 at 13:08
_"For example: in many cases, key points to a value that lives in the stack"_: if you have pointers to dynamic memory and to stack memory in the same tree, you're looking for trouble. — Jabberwocky, May 31 '23 at 13:08
To me it seems that the problem is that you are implementing a BST without knowing what you will use it for (that is a good exercise). Then you imagine a lot of different use cases (also good). Then you want to support all of those. This is problematic because it leads to the [soft coding anti-pattern](https://stackoverflow.com/questions/823610/what-is-soft-coding-anti-pattern). Unless you have a specific need, my advice would be to start implementing a BST with a simple value (e.g. an integer) and then extend the implementation if/when a need comes up. — nielsen, May 31 '23 at 13:29
@nielsen In the context of this project, a HTTP server built from the ground up, I use the BST to app paths to functions (key is path, value is struct that has the function to be called and some values that describe the path (content accepted, valid methods, etc.) (ex: http://exampledotcom/products calls a func that is valid with GET that expects an argument in the path and said func queries a db and writes to a client connected with a TCP/IP socket the result set of the query). I then decided to also utilize the BST to store headers I parse from requests to pass them to the function. — randomdood1923, May 31 '23 at 13:56

score 4 · Answer 1 · answered May 31 '23 at 15:54

Managing lifetime of objects is a very big topic in programming, and so is transferring permission of who manages that object lifetime. There are multiple ways you can approach the problem.

you can restrict that you use only dynamic allocation with standard C api.
you can restrict that all values are created and destroyed by the user. You API only takes and returns the values to add and remove, and user manages memory. This overall restricts the API design to most simple functions. See bsd queue.h https://github.com/openbsd/src/blob/master/sys/sys/queue.h#L175 .
You can pass an optional deallocation function and any other every time you need it. This results in very, very long function arguments each time you want to do something, but is good for "ad-hoc" functions.

void bsttree_free_cb(bsttree *tree, 
          void (*deallocate)(void *value, void *cookie),
          void *cookie) {
    for (every node to free) {
         if (deallocate) {
             deallocate(value, cookie);
         }
     }
}
void bsttree_free(bsttree *tree) {
    bsttree_free_cb(tree, NULL, NULL);
}

You store deallocation function(-s) with the tree and call it when freeing the tree. This is a great overall solution. With cookie you can for example pass all the indexes of keys allocated on the stack and exclude them from free(). Overall, this is the way you could consider.

typedef struct bsttree {
    struct bstnode *head;
    void (*deallocate_value)(void *value, void *cookie);
    void (*deallocate_key)(void *key, void *cookie);
    void (*other_operation_value)(void *value, void *cookie);
    void *cookie;
} bsttree;

You can store a deallocation function with each value and key and call it when deallocating. This will (greatly) increase the size of each element.

typedef struct bstnode {
    char *key;
    void *value;
    void (*deallocate_value)(void *value, void *cookie);
    void (*deallocate_key)(void *key, void *cookie);
    void (*other_operation_value)(void *value, void *cookie);
    void *cookie;
    struct bstnode *left;
    struct bstnode *right;
} bstnode;

Finally, you write your code templated over each type. Using macros or code generation - either X macros or by replacing the macros - generate a function for each deallocation function and type that the user needs. You might be interested in https://github.com/stclib/STC/tree/master project.

You may also be interested in https://docs.gtk.org/glib/struct.Tree.html .

John Bollinger · Answer 2 · 2023-05-31T15:03:10.147

is what I am doing bad practice? Is there a better way to go about it than having to write functions for every use case of the structure?

These are good questions. You are being attentive to some important details.

The bottom line is that if you want to accommodate the most general case, where your nodes can contain pointers to values of varying type and provenance, then yes, you need somehow to provide information that allows your tree-manipulation routines to trigger the correct cleanup when required.

Some possibilities include:

In each node, include a pointer to a function to use to deallocate the data for that node. This is among the most general alternatives available.
Equip your tree-manipulation functions with parameters by which you can provide a (pointer to a) deallocation callback that they are to use to free the data of any node that is to be deleted. This is often sufficient, but it does not easily handle cases where different cleanup mechanisms are required for different data in the same tree.
Provide a structure representing the tree overall, and attach appropriate cleanup functions to that. This allows for a cleaner API than the previous option, but it boils down to pretty much the same thing.

score 0 · Answer 3 · answered Jun 01 '23 at 22:47

The most common way to handle this in data structures is to recognize that data structures will point to data that they didn't allocate and don't control.

So the common way to handle this is that in your C program you'll create objects that you can add to the data structures. But the layer of code that does the allocating should also do the deallocating. It should also avoid leaving a node in your BST pointing to the deallocated object.

So provide your bst api with a bst_delete(bsttree*, char* key) function that will remove the bstnode from the bst (and free it). But won't deallocate the actual object that points to. Because that's not it's job.

The calling code should call bst_delete() and then free() the object that needs to be deleted.

Don't add objects allocated on the stack to the bsttree unless the tree is also destroyed in the lifetime of the function that is using that stack context. When destroying the bsttree you'll need to delete all it's nodes to avoid a memory leak.

Deallocating a data structure with a void* value

3 Answers3