
This is my very first question here, I think. I wrote my C code trying to respect "-Wall -Wextra -Werror -ansi -pedantic", but I can be flexible.

I wrote a function with the following prototype:

int messagequeue_add(messagequeue_t *p_queue, char* p_msg, const size_t p_size)

I use "realloc(p_msg, XXX)" in this function, so it must not receive a pointer to a string literal, only a dynamically allocated pointer. How can I forbid string literal pointers and allow only dynamically allocated pointers in the function prototype, if possible with standard C89 or C99?

I have already come across functions that reject string literals in their prototype arguments (error or warning at compile time), but I cannot remember where or how they were defined.

I searched Google extensively and went through my whole paper library, but the question is too specific and I could not find the needle in the haystack. I'm quite experienced, but this time I do need help.

The full function code is:

int messagequeue_add(messagequeue_t *p_queue, char* p_msg, const size_t p_size) {
    messagequeueitem_t l_messagequeueitem;

    /* Parameters sanity checks for early return */
    if (!p_queue) {
        warnx("%s:%s():%d (%s@%s): p_queue can not be NULL",
              __FILE__, __func__, __LINE__, __DATE__, __TIME__);
        return -1;
    }
    if (!(*p_queue)) {
        warnx("%s:%s():%d (%s@%s): *p_queue can not be NULL",
              __FILE__, __func__, __LINE__, __DATE__, __TIME__);
        return -1;
    }
    if ((!p_msg)&&(p_size)) {
        warnx("%s:%s():%d (%s@%s): p_msg can not be NULL with p_size>0",
             __FILE__, __func__, __LINE__, __DATE__, __TIME__);
        return -1;
    }

    ASSERT(p_queue);
    ASSERT(*p_queue);
    ASSERT( ((NULL==(*p_queue)->head)&&(NULL==(*p_queue)->tail))
            ||((NULL!=(*p_queue)->head)&&(NULL!=(*p_queue)->tail)));
    ASSERT((p_msg)||((!p_msg)&&(!p_size)));

    /* Create messagequeueitem_t structure */
    if (NULL == ((l_messagequeueitem) = (messagequeueitem_t)malloc(sizeof(struct messagequeueitem_s)))) {
        warn("%s:%s():%d (%s@%s): malloc failed",
             __FILE__, __func__, __LINE__, __DATE__, __TIME__);
        return -1;
    }

    ASSERT(l_messagequeueitem);
    l_messagequeueitem->size = p_size;
    l_messagequeueitem->next=NULL;

    /* Do not create a copy of the message but take ownership of the allocated
     * contents. */

    /* Resize the memory allocation to fit in the case sizeof(p_msg) != p_size */
    DBG_PRINTF("p_msg=%p",p_msg);
    if (NULL == ( l_messagequeueitem->msg = realloc( p_msg, l_messagequeueitem->size))) {
        warn("%s:%s():%d (%s@%s): realloc failed",
                __FILE__, __func__, __LINE__, __DATE__, __TIME__);
        free(l_messagequeueitem);
        return -1;
    }

    if ((*p_queue)->tail)
        (*p_queue)->tail->next = l_messagequeueitem;
    else
        (*p_queue)->head = l_messagequeueitem;
    (*p_queue)->tail = l_messagequeueitem;

    return 0;
}
  • You can't (except this code will probably give a const warning if you try to pass in a literal string), but also this code is very weird because if you realloc p_msg, it invalidates the caller's pointer, making it difficult to manage the memory correctly. – Paul Hankin Oct 28 '22 at 08:00
  • Perhaps your `p_msg`/`p_size` arguments are some kind of buffer that you manage in your code. Wrapping them in an abstraction (eg: a struct) and enforcing (or at least soft-enforcing) creation/deletion/manipulation of the abstraction via your own module may give you enough control. Hard to say without seeing more of your code. – Paul Hankin Oct 28 '22 at 08:04
  • The non-`const` nature of `p_msg` gives you an error (a warning without "-Werror") if you pass a string literal, which decays to a `const char *`. However, if you pass a pointer to a character array or a single character, writable, statically allocated or on the stack, the C language provides no measure against it. So, you cannot. – the busybee Oct 28 '22 at 08:16
  • Is this a function in a library, called by unknown future users? If so, you need something like the abstraction suggested by Paul. Else use unit testing and static code analysis to make sure _your own code_ calls the function correctly. – the busybee Oct 28 '22 at 08:18
  • @PaulHankin, @thebusybee String literals in C are not `const` qualified (even though they are not modifiable). Perhaps you are thinking of C++? – Ian Abbott Oct 28 '22 at 08:42
  • If you use `realloc(p_msg, XXX)` in the function, do the callers of `messagequeue_add` expect the pointer to be no longer valid on return from the function? Perhaps you could rewrite the function to allocate its own memory and not reallocate the caller's memory. – Ian Abbott Oct 28 '22 at 08:49
  • @IanAbbott yes, you're right, string literals aren't const-qualified in C and so my initial comment was wrong about this causing a const warning. – Paul Hankin Oct 28 '22 at 08:58
  • @PaulHankin, when called, this function becomes the "owner" of the pointer. The calling code is not supposed to manipulate it anymore; that is the contract. It enqueues a message, owns the message, and the code calling 'pop' will become the new owner of the pointer. I want to avoid memcpy and useless duplication for 2 reasons: RAM usage and performance. – François Cerbelle Oct 28 '22 at 09:28
  • @thebusybee It is mainly for my code, but can be used by others. Everything will be fully documented at release time. I discovered this when preparing and running all my unit tests. – François Cerbelle Oct 28 '22 at 09:28
  • Yes, if you give ownership of some memory, the new owner has to know how the memory was allocated (whether or not it uses realloc -- for example, at some point the memory has to be freed). C doesn't provide any convenient way to help with this. I'd probably just add a comment saying the memory needs to be allocated with malloc, but also try to avoid the need for callers to over-allocate if possible (meaning a potentially expensive realloc isn't needed). – Paul Hankin Oct 28 '22 at 10:52
  • This is all one big bug, since `realloc` should return the new pointer to the caller, since the pointer returned from realloc may point at a different location. It could also be a null pointer. Start with fixing that bug, then you can concern yourself with additional safety from there. – Lundin Oct 28 '22 at 12:32
  • @Lundin I do not understand the bug you pointed out: ``` if (NULL == ( l_messagequeueitem->msg = realloc( p_msg, l_messagequeueitem->size))) { warn("%s:%s():%d (%s@%s): realloc failed", __FILE__, __func__, __LINE__, __DATE__, __TIME__); free(l_messagequeueitem); return -1; } ``` If realloc fails, the function returns to the caller with an error (-1) and the caller is still the owner of the RAM pointed to by p_msg (unchanged address). If realloc succeeds, p_msg is no longer used and the reallocated address (same or not) is used instead. – François Cerbelle Oct 28 '22 at 12:59
  • " if realloc succeed, p_msg is no longer used and the reallocated address (same or not) is used instead." But `p_msg` is a local variable to the function, not returned to the caller. – Lundin Oct 28 '22 at 13:01
  • @PaulHankin Your last comment is exactly my point, I wanted to check if there is a way to tell the preprocessor to enforce that. – François Cerbelle Oct 28 '22 at 13:02
  • @Lundin yes, p_msg is a local variable, containing a copy of the address known by the caller. Thus, if realloc fails, the function returns an error (-1) and the address known by the caller is still valid. – François Cerbelle Oct 28 '22 at 13:05
  • There is no guarantee that realloc returns a pointer to the same object if successful. All it guarantees is that data is preserved. – Lundin Oct 28 '22 at 13:08
  • https://stackoverflow.com/questions/44789295/correct-use-of-realloc – Lundin Oct 28 '22 at 13:08
  • @Lundin Indeed, if realloc fails, it guarantees that the original address is still valid. My function returns an error, and the caller knows that nothing happened: he still owns the block and still has the original unmodified address. On the other hand, if realloc succeeds, the memory address might change, but my function will complete and return success to the caller. The caller knows that he no longer owns the address, that the address might not be valid anymore, and will not use it. – François Cerbelle Oct 28 '22 at 13:24
  • But that is just obscure... and in general, this is far too needlessly complex. Good programming = reducing complexity, not increasing it. I think the solution to all your problems is to use pure caller allocation and pass along a size. Split the memory allocation from the actual algorithm. – Lundin Oct 28 '22 at 13:35
  • @Lundin It does not solve the issue. A first caller (producer) allocates memory and pushes the data into a queue. Another caller (consumer) will get the data from the queue and free the memory. memcpy is avoided for performance and RAM optimization (it would otherwise be too simple). The only solution is a contract: the producer is the owner until the data is acknowledged by the queue, and the queue is the owner until the data is fetched by a consumer. And my question was: **how to make the preprocessor refuse pointers to string literals and only accept dynamically "mallocated" pointers.** – François Cerbelle Oct 28 '22 at 14:19
  • @FrançoisCerbelle You **can't** solve the issue. Checking for string literals is insufficient: `char str[] = "Some string";` is not a string literal, but your code will still fail if passed such an argument. You can't even check the value of the pointer and reliably see if it originated from heap memory. Pointers from `mmap()` can originate from `malloc()` or a user's direct call to `mmap()` itself, for example. Just `strdup()` the string at the start, claim ownership, and be done with it. – Andrew Henle Oct 28 '22 at 14:47

1 Answer


I wrote a function with the following prototype:

int messagequeue_add(messagequeue_t *p_queue, char* p_msg, const size_t p_size)

I use "realloc(p_msg, XXX)" in this function, so it must not receive a pointer to a string literal, only a dynamically allocated pointer. How can I forbid string literal pointers and allow only dynamically allocated pointers in the function prototype, if possible with standard C89 or C99?

No version of standard C provides a mechanism to distinguish between pointers (in)to string literals and other pointers to characters. The elements of the array that a character string literal represents have type char, so pointers to them can have type char *. C does not specify the presence of any metadata by which these can be distinguished from other values of the same type.

This is usually dealt with via coding convention. Specifically,

  1. a pointer to or into a string literal may be assigned only to variables and function parameters that are pointers to const data, and

  2. const correctness is rigorously observed (no casting away const or ditching it by other means).

That implies that where you want to accept pointers that may point to string literals, you use type const char *.

Adhering to such conventions precludes providing pointers to string literals to code that expects pointers to modifiable data.
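
A minimal sketch of that convention follows. The function names `count_spaces`, `spaces_to_underscores`, and `demo_const_convention` are hypothetical and exist only for illustration; the point is that the literal is only ever held through a pointer to const, so handing it to the modifying function would discard a qualifier and draw a diagnostic.

```c
#include <assert.h>
#include <stddef.h>

/* Read-only access: callers may pass string literals here. */
size_t count_spaces(const char *text)
{
    size_t n = 0;

    assert(text != NULL);
    for (; *text != '\0'; ++text)
        if (*text == ' ')
            ++n;
    return n;
}

/* Modifies the buffer in place: by convention, never given a literal. */
void spaces_to_underscores(char *text)
{
    assert(text != NULL);
    for (; *text != '\0'; ++text)
        if (*text == ' ')
            *text = '_';
}

/* Hypothetical usage showing both sides of the convention. */
size_t demo_const_convention(void)
{
    const char *literal = "hello big world"; /* literal held via pointer to const */
    char buffer[] = "hello big world";       /* writable array initialized from it */

    spaces_to_underscores(buffer);           /* fine: buffer is modifiable */
    /* spaces_to_underscores(literal);          would discard const: diagnosed */

    return count_spaces(literal) + count_spaces(buffer);
}
```

With GCC, the `-Wwrite-strings` option additionally gives string literals a const-qualified type in C, so a direct call such as `spaces_to_underscores("a b")` is diagnosed as well. That option is a compiler extension to the convention, not standard C behavior.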


Specific compilers may offer features that help. For example, many have ways to generate warnings and / or errors for violations of const correctness. Some have options specifically aimed at detecting attempts to modify string literals. For example, if you compile with GCC and enable its static analyzer, then it will detect (at least some) attempts to write to const data and string literals.

BUT

You have characterized your problem too narrowly. If you want to be able to realloc() the p_msg pointer passed to your function, then that pointer must have been previously obtained by dynamic allocation (and not since free()d). That is a much stronger requirement than that it not point to a string literal, so focusing on the string literal angle may not be so useful.
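
To illustrate why the string literal angle is too narrow, here is a small sketch. The helper `fit_block` and the function `demo_fit` are hypothetical names, not part of the question's code. It shows one valid call and, in comments, three invalid ones, of which only one involves a literal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resizes a heap block to exactly new_size bytes.
 * Contract: msg must come from malloc/calloc/realloc and not yet be freed.
 * On success the caller's old pointer must no longer be used;
 * on failure (NULL) the old pointer is still valid and still owned. */
char *fit_block(char *msg, size_t new_size)
{
    assert(new_size > 0);
    return realloc(msg, new_size);
}

/* Returns 0 if the contents survived the resize, -1 otherwise. */
int demo_fit(void)
{
    char on_stack[] = "modifiable, but not from malloc";
    char *on_heap = malloc(64);
    char *fitted;
    int ok;

    if (on_heap == NULL)
        return -1;
    strcpy(on_heap, "from malloc");

    /* fit_block(on_stack, 8);     undefined: array with automatic storage  */
    /* fit_block("literal", 8);    undefined: string literal                */
    /* fit_block(on_heap + 1, 8);  undefined: not the start of the block    */
    (void)on_stack;

    fitted = fit_block(on_heap, strlen(on_heap) + 1);
    if (fitted == NULL) {
        free(on_heap); /* resize failed: the original block is still ours */
        return -1;
    }
    /* From here on, only fitted may be used; on_heap is dead. */
    ok = (strcmp(fitted, "from malloc") == 0);
    free(fitted);
    return ok ? 0 : -1;
}
```

All three commented-out calls have the same parameter type as the valid one, which is why the prototype alone cannot tell them apart.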

You should absolutely document this requirement, and you should propagate that requirement to the documentation of other functions as appropriate. But you're not likely to get a compiler assist like you could if the requirement were merely that the pointed-to data could be modified. C does not provide any value or data type property that describes data being modifiable but not de-/re-allocatable. There are, however, tools for detecting such issues at runtime. Valgrind can do that, for example.
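
If some compile-time help is wanted anyway, one option, along the lines of the abstraction suggested in the comments, is to stop accepting a bare `char *` and accept a dedicated buffer type instead. The sketch below is only an assumption about how that could look: `msgbuf_t`, `msgbuf_alloc`, `msgbuf_resize`, and `msgbuf_free` are hypothetical names. The producer writes directly into `data`, so no extra copy is introduced.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: msgbuf_alloc() is the only intended way to create
 * one, so data always points to the start of a malloc'ed block. */
typedef struct msgbuf_s {
    char  *data;
    size_t size;
} msgbuf_t;

/* Allocates a buffer of the given capacity; returns NULL on failure. */
msgbuf_t *msgbuf_alloc(size_t capacity)
{
    msgbuf_t *buf;

    if (capacity == 0)
        return NULL;
    buf = malloc(sizeof *buf);
    if (buf == NULL)
        return NULL;
    buf->data = malloc(capacity);
    if (buf->data == NULL) {
        free(buf);
        return NULL;
    }
    buf->size = capacity;
    return buf;
}

/* Resizes the payload; safe because data is known to be heap-allocated.
 * Returns 0 on success, -1 on failure (the buffer is left unchanged). */
int msgbuf_resize(msgbuf_t *buf, size_t new_size)
{
    char *resized;

    if (buf == NULL || new_size == 0)
        return -1;
    assert(buf->data != NULL); /* invariant established by msgbuf_alloc() */
    resized = realloc(buf->data, new_size);
    if (resized == NULL)
        return -1;
    buf->data = resized;
    buf->size = new_size;
    return 0;
}

/* Releases the payload and the wrapper; NULL is accepted. */
void msgbuf_free(msgbuf_t *buf)
{
    if (buf != NULL) {
        free(buf->data);
        free(buf);
    }
}
```

A queue function declared to take a `msgbuf_t *` would then reject string literals and plain character arrays at compile time, because their types are incompatible. This remains soft enforcement: a caller can still fill in the struct by hand unless the definition is hidden behind an opaque declaration in the header.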

John Bollinger
  • Thanks a lot. In any case, I'll document thoroughly this function, for sure, I only wanted to know if a prototype syntax would say and enforce the opposite of "const". I thought that it existed, but I was wrong. Thanks a lot for your help. The question is closed/solved. I'll try to do it (this was my very first question asked here). – François Cerbelle Oct 28 '22 at 16:33