https://godbolt.org/z/qZVO3a

This is a minimal reproduction of the warnings I see. Obviously UB can be bad, but while I think many of the situations below are okay, some uses are really nasty and I need to determine which ones require corrective action.

#include <stdarg.h>
#include <stdio.h>
#include <limits.h>

typedef struct _thing {

    char  first[4];
    char  second[10];
    char  last[111];
}THING;


void custom_printf(char* _format, ...) __attribute__((format(printf, 1,2)));
void custom_printf(char* _format, ...) 
{
    // get buffer from some source
    char buffer[1024];
    va_list ap;
    va_start(ap, _format);
    vsnprintf(buffer, 1024, _format, ap);
    va_end(ap);
    // use buffer for some purpose

}

int main(){

    custom_printf("HI THERE%d");
    custom_printf("HI THERE", 1);
    custom_printf("val: %d", (void*)0);
    custom_printf("val: %p", 0);
    custom_printf("val: %lld", 1);
    custom_printf("val: %s", (THING){"A", "AA", "CCCC"});
    custom_printf("val: %0.30s","HI");
    custom_printf("val: %d",LLONG_MAX);
}

The warnings I see include:


<source>: In function 'main':

<source>:26:5: warning: format '%d' expects a matching 'int' argument [-Wformat]

<source>:27:5: warning: too many arguments for format [-Wformat-extra-args]

<source>:28:5: warning: format '%d' expects argument of type 'int', but argument 2 has type 'void *' [-Wformat]

<source>:29:5: warning: format '%p' expects argument of type 'void *', but argument 2 has type 'int' [-Wformat]

<source>:30:5: warning: format '%lld' expects argument of type 'long long int', but argument 2 has type 'int' [-Wformat]

<source>:31:5: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'THING' [-Wformat]

<source>:32:5: warning: '0' flag used with '%s' gnu_printf format [-Wformat]

<source>:33:5: warning: format '%d' expects argument of type 'int', but argument 2 has type 'long long int' [-Wformat]

<source>:34:1: warning: control reaches end of non-void function [-Wreturn-type]

Compiler returned: 0

It's my understanding that the code above has many flavors of UB. After looking around, I've seen that I should just fix all of it. I do want to fix them all eventually, but for now my curiosity makes me wonder which is the worst scenario. I'd assume it is cases like the first, where I'm not passing in enough arguments.

It's my understanding that in the above I have:

  1. Popping off stack that doesn't exist
  2. Not popping enough off the stack
  3. Padding a string with leading zeros
  4. Casting integer to pointer
  5. Casting a struct that can be cast to

Out of the above I'm fairly certain that anything that pops off the stack that doesn't exist will lead to the worst scenario. But I'm also wondering what the other severe cases are.

will.mont
  • #2 is perfectly defined. – melpomene Mar 13 '19 at 02:22
  • @melpomene: Hmm? If a called function fails to pop enough stuff from the stack and then returns, the calling function will have a malformed stack. – Eric Postpischil Mar 13 '19 at 11:17
  • @EricPostpischil The contents of `custom_printf(char* _format, ...)` do not pop stuff - calling code handles that. `custom_printf()` simply examines the args and maybe not all of them. – chux - Reinstate Monica Mar 13 '19 at 11:33
  • @chux: Whether `custom_printf` pops anything is a separate question from whether not popping enough from the stack would have undefined behavior. – Eric Postpischil Mar 13 '19 at 11:45
  • @EricPostpischil yes, perfectly well-defined. The callee *cannot* pop stuff from stack with varargs functions. That's why the stdcall of MSVC cannot be used for varargs, and the docs say that: "The __stdcall calling convention is used to call Win32 API functions. The callee cleans the stack, so the compiler makes vararg functions __cdecl. Functions that use this calling convention require a function prototype." – Antti Haapala -- Слава Україні Mar 13 '19 at 13:17
  • @AnttiHaapala: Again, whether `custom_printf` pops anything is a separate question from whether not popping enough from the stack would have undefined behavior. You are asserting that `custom_printf`, while using variable arguments, does not pop anything from the stack, and hence the situation of not popping enough from the stack does not occur in this case. But that has nothing to do with whether, if not enough items are popped from the stack (which might have to be in a different situation), undefined behavior would result. – Eric Postpischil Mar 13 '19 at 13:36
  • @EricPostpischil Sounds like you are suggesting `foo(char *f,...) { va_list ap; va_start(ap, f); va_end(ap); } foo("xx",3,4,5);` is potential UB as `foo()` did not use the `3,4,5`. Is that an example of the potential UB you see? If not, could you point to one? – chux - Reinstate Monica Mar 13 '19 at 18:24
  • @chux: No. melpomene commented that “#2 is perfectly defined.” That seems to refer to “2. Not popping enough off the stack.” I am not discussing **at all** whether not popping enough off the stack occurs in this situation or not. I am discussing whether **when a situation occurs where not enough is popped off the stack**, the resulting behavior is perfectly defined or not. That could be a caller using a declaration of `void foo(int, int, int)` but the definition is `void foo(int, int)`, and the ABI requires the called routine to pop the arguments. – Eric Postpischil Mar 13 '19 at 19:07
  • @EricPostpischil As this issue is about "variadic arguments", we need to consider `foo(type *bar, ...);` and the like. Does your UB concern still apply there? – chux - Reinstate Monica Mar 13 '19 at 22:48
  • @EricPostpischil This question is about C. There is no stack. – melpomene Mar 13 '19 at 23:18
  • @melpomene: The fact the C standard does not require a hardware-implemented stack does not mean there is no stack. And, if there is no stack, why did you say #2 is “perfectly defined”? – Eric Postpischil Mar 13 '19 at 23:36
  • @EricPostpischil Even if there is a stack, you can't pop it from C. #2 is the second printf, `custom_printf("HI THERE", 1);`. – melpomene Mar 13 '19 at 23:51
  • @melpomene: Uh, okay, when you use “#2” to refer to something in text that contains a numbered list but are not referring to an item in the list, you might want to rethink that. – Eric Postpischil Mar 14 '19 at 00:12

1 Answer

At which point is the undefined behavior problematic?

All UB is problematic.

Identifying a particular compiler version's UB effects has some merit in problem solving. Yet one should never rely on that UB effect to persist.

My answer is based on C in general, not on gcc 4.7.


Consider that objects are not necessarily passed using the same mechanism across types. A real, related example: float/double arguments may be passed on an FP stack and other types via the usual stack. printf("%llx\n", 1.234); can fail badly even though 8 bytes are passed and 8 are expected, because the values sit in different places. A similar difference could occur between pointer types and integers (although that sounds like a unicorn platform).
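If the goal is to look at the bits of a double, a sketch like the following sidesteps the mismatch: it copies the object representation into an integer of the same size and prints that with the matching specifier. The names `double_bits` and `format_double_bits` are hypothetical, and the code assumes `double` is 64 bits wide (checked at compile time, which needs C11).

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: double occupies exactly 64 bits on this platform. */
_Static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Hypothetical helper: returns the object representation of a double
   as a 64-bit unsigned integer. */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* defined behavior, unlike a mismatched %llx */
    return bits;
}

/* Formats the bits into out using the specifier that matches uint64_t. */
int format_double_bits(char *out, size_t n, double d)
{
    return snprintf(out, n, "%" PRIx64, double_bits(d));
}
```

The argument passed to the formatting function is now a real `uint64_t`, so it travels through whatever mechanism the platform uses for integers and the specifier agrees with it.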


Leaving UB in code is inefficient in development.
Consider that even if one did find some UB that worked great in a select case, the next compilation or compiler version may render different results. By fixing it, you save the time spent trying to explain how "this UB is OK, I know I tested it" during a code review. You also save the time needed to find a way to quiet the warning for this one "good" UB. The programming team that has to maintain your UB code will mutter evil things about the prior coder.


UB. Missing matching argument.

custom_printf("HI THERE%d");
<source>:26:5: warning: format '%d' expects a matching 'int' argument [-Wformat]

Not UB. Extra args are OK, yet this is likely a coding misstep, hence the warning. @melpomene

custom_printf("HI THERE", 1);
<source>:27:5: warning: too many arguments for format [-Wformat-extra-args]
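As a rough illustration of why extra arguments are harmless, here is a minimal sketch of a variadic function that reads only the arguments it was told about and never touches the rest. The function `sum_first` is hypothetical and not part of the question's code.

```c
#include <stdarg.h>

/* Hypothetical example: sums the first `count` int arguments and
   ignores anything the caller passed after them. */
int sum_first(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);   /* read only what the caller promised */
    }
    va_end(ap);                     /* leftover arguments are never read */
    return total;
}
```

A call such as `sum_first(2, 10, 20, 30)` evaluates all four arguments but only reads `10` and `20`. The opposite case, asking `va_arg` for an argument that was never passed, is the one that is undefined.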

UB. int and void * may differ in size, legal values and function passing mechanisms.

custom_printf("val: %d", (void*)0);
<source>:28:5: warning: format '%d' expects argument of type 'int', but argument 2 has type 'void *' [-Wformat]

UB. Same as line 28.

custom_printf("val: %p", 0);
<source>:29:5: warning: format '%p' expects argument of type 'void *', but argument 2 has type 'int' [-Wformat]

UB. int and long long may differ in size and function passing mechanisms.

custom_printf("val: %lld", 1);
<source>:30:5: warning: format '%lld' expects argument of type 'long long int', but argument 2 has type 'int' [-Wformat]

UB. The types may differ in size, legal values and function passing mechanisms.

custom_printf("val: %s", (THING){"A", "AA", "CCCC"});
<source>:31:5: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'THING' [-Wformat]

UB. %0.30s is not a valid standard specifier (the 0 flag is not defined for %s), so anything may happen. It is well behaved only on select systems that define behavior for this non-standard specifier.

custom_printf("val: %0.30s","HI");
<source>:32:5: warning: '0' flag used with '%s' gnu_printf format [-Wformat]

UB. Like line 30.

custom_printf("val: %d",LLONG_MAX);
<source>:33:5: warning: format '%d' expects argument of type 'int', but argument 2 has type 'long long int' [-Wformat]

Not UB with main(). For functions in general, it is only a UB problem if the calling code uses the return value. Yet main() is special: if it reaches its closing } without a return, the code acts as if a return 0; were at the end.

<source>:34:1: warning: control reaches end of non-void function [-Wreturn-type]
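For reference, one possible set of corrected calls is sketched below, under the assumption that each call was meant to print the value it passes. `format_to` is a hypothetical stand-in for `custom_printf` that writes into a caller-supplied buffer so the result can be inspected, and `corrected_calls` is likewise made up for illustration. The `format` attribute is a GCC/Clang extension, as in the question.

```c
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical variant of custom_printf that fills a caller-supplied buffer. */
int format_to(char *out, size_t n, const char *format, ...)
    __attribute__((format(printf, 3, 4)));

int format_to(char *out, size_t n, const char *format, ...)
{
    va_list ap;
    va_start(ap, format);
    int written = vsnprintf(out, n, format, ap);
    va_end(ap);
    return written;
}

typedef struct thing {
    char first[4];
    char second[10];
    char last[111];
} THING;

/* Each call pairs a conversion specifier with an argument of the expected type. */
void corrected_calls(char *out, size_t n)
{
    THING t = {"A", "AA", "CCCC"};

    format_to(out, n, "HI THERE%d", 1);         /* supply the missing int */
    format_to(out, n, "val: %d", 0);            /* int for %d */
    format_to(out, n, "val: %p", (void *)0);    /* void * for %p */
    format_to(out, n, "val: %lld", 1LL);        /* long long for %lld */
    format_to(out, n, "val: %s", t.first);      /* a string member, not the struct */
    format_to(out, n, "val: %.30s", "HI");      /* precision without the 0 flag */
    format_to(out, n, "val: %lld", LLONG_MAX);  /* %lld for a long long value */
}
```

With these changes none of the -Wformat warnings above should fire, since every specifier and argument type agree.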
chux - Reinstate Monica
  • I've always heard that UB can be used to benefit the compiler. Here is a case where in some cases it hides silly errors. As a C developer should I not take advantage of UB? – will.mont Mar 13 '19 at 03:39
  • @will.mont What warnings are you rating as "silly" errors? – chux - Reinstate Monica Mar 13 '19 at 03:41
  • Take advantage of UB when you can do a better job than the compiler with its team of senior developers and thousands of reviews - which is taking advantage of your code. Remember that with each new version of the compiler you try, you will need to re-assess this gambit. – chux - Reinstate Monica Mar 13 '19 at 03:54
  • @will.mont UB may allow the compiler to do better optimization or whatever. If the result of your code is undefined, the compiler can create anything it wants. It cannot be wrong by definition. UB does not allow the developer to do any better it just gets worse code. – Gerhardh Mar 13 '19 at 11:54
  • And the line 34 is not UB per-se... the function is required to return. The return value is ~indeterminate. You make it sound like the omitted return is UB but not a problem ;) – Antti Haapala -- Слава Україні Mar 13 '19 at 13:20
  • @AnttiHaapala The non-UB-ness of line 34 detailed. – chux - Reinstate Monica Mar 13 '19 at 13:23