4

I have noticed that a large number of C compilers issue warnings when the conversion specifiers in the format string of the printf/sprintf functions do not match the type or the count of the corresponding arguments.

That seems to me like a conceptual break since C doesn't have built-in functions according to the language specification.

All the compiler should know about printf/sprintf is their prototypes, not their semantics. I know that printf/sprintf are standard C functions, yet they reside in a separate library, libc, and you have to include stdio.h to import their prototypes.

What many compilers do instead is analyze the format string, which could just as well be supplied at runtime.

Does the above make sense?

Blagovest Buyukliev
  • A conceptual break from what? – Martin Ba Sep 14 '10 at 14:10
  • One word: QOI (well: 3 in fact) – pmg Sep 14 '10 at 14:10
  • @Martin: As far as I know, C doesn't have built-in functions. How, then, does the compiler understand the semantics of the printf/sprintf functions when they are defined in an outside library? – Blagovest Buyukliev Sep 14 '10 at 14:15
  • Nope, no sense. What's a conceptual break in this context? – Matt Joiner Sep 14 '10 at 14:15
  • @Blagovest: the semantics aren't defined in any library, they're defined in the standard (mostly: the implementation can add new format specifiers). Your implementation might use some linked shared object to supply the actual code, but that doesn't change the defined behaviour of the function. – Steve Jessop Sep 14 '10 at 14:21
  • @Blagovest Buyukliev: Much of the time these things are enforced by compiler-specific attributes in the function declarations (e.g. `__attribute__((format(printf,m,n)))` in gcc) that say whether a function has `printf`-like semantics. – jamesdlin Sep 14 '10 at 20:02
  • @jamesdlin: +1 for hitting the nail on the head! That really answers my question. – Blagovest Buyukliev Sep 15 '10 at 10:21

7 Answers

11

"All the compiler should know about printf/sprintf is their prototypes and not their semantics".

That's the part that isn't true. As far as the standard is concerned, any part of a C implementation is "allowed" to know about any other part, and to issue diagnostics that may be helpful to the user. Compiler intrinsics aren't required by the standard, and neither is this particular diagnostic, but they certainly aren't forbidden.

Note that (as far as the standard is concerned) the standard library is special; it's not just any old linked library. If a particular implementation/compiler even provides a mechanism for the user to link against a different version of the standard library, the standard certainly doesn't require it to "work" when that alternative library has different semantics from what is laid out in the standard.

So in that sense, everything in the standard library is a "built-in". It's part of the C language specification. Compilers are allowed to act on the assumption that it behaves as the standard requires.

Of course, if the format string isn't known until runtime, then the compiler can't do a static check of the varargs. But when it is known at compile time, the compiler can assume the behaviour of printf just as validly as it can assume the behaviour of memcpy, or of integer addition.
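To make the distinction concrete, here is a minimal sketch of the two cases. The function names `format_literal` and `format_runtime` are made up for illustration, and whether a given compiler actually warns depends on the compiler and its warning flags.

```c
#include <stdio.h>

/* The format string is a literal, so the compiler can read "%d" and "%s"
   and compare them against the types of count and name at compile time. */
int format_literal(char *buf, size_t size, int count, const char *name)
{
    return snprintf(buf, size, "%d items for %s", count, name);
}

/* The format string only arrives at run time, so there is nothing for the
   compiler to inspect. A mismatch between fmt and count is still undefined
   behaviour, but it cannot be diagnosed statically here. */
int format_runtime(char *buf, size_t size, const char *fmt, int count)
{
    return snprintf(buf, size, fmt, count);
}
```

Swapping `count` and `name` in `format_literal` is the kind of mistake gcc and clang typically report under `-Wformat` (enabled by `-Wall`). The same mistake made through `format_runtime` passes silently.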

Steve Jessop
2

If I read your question correctly, I agree with your premise that verification of printf and friends' format strings by the compiler is an activity conceptually unlike the other sorts of static checking (syntax, type, etc.) done by the compiler.

However, it is permitted by the standard, and helps us poor programmers out greatly.

Derrick Turk
2

The compiler's task here is just to give you some useful hints. This behaviour is not covered by the standard.

"An implementation may generate warnings in many situations, none of which are specified as part of this International Standard."

In theory, nothing prevents a compiler from warning you about (potentially) incorrect usage of, say, the Qt library.

And printf is a standard function in the sense that it (including its semantics) is covered by the ISO C standard.

Roman Cheplyaka
2

The Standard requires diagnostics under some circumstances, but does not disallow general diagnostics. Any implementation is free to issue a diagnostic for any reason, including improper use of printf() or the overuse of the letter Q. Obviously, some of these reasons are more useful than others.

Moreover, if you include a standard header, all the visible identifiers in it become reserved. You are not allowed to #include <stdio.h> and have your own definition of printf (see 7.1.3 of the draft C99 standard). This means that the implementation is free to assume that you're using the standard printf and treat it as if it were a required part of the standard.

David Thornley
  • "The Standard requires diagnostics under some circumstances" -- no it doesn't, see quotation in my answer (it's from ISO/IEC 9899:TC2). Or can you point me to a place (possibly in some other ISO or ANSI C standard?) where it does? – Roman Cheplyaka Sep 14 '10 at 14:35
  • I have the draft C99 standard here (don't own a real copy), and in 5.1.1.3 I find "A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined." Is this omitted from the final version? – David Thornley Sep 14 '10 at 14:39
  • @Roman: Also, it looks to me that that's from the start of Annex I (informative), titled "Common Warnings". It refers to warning messages and not diagnostic messages (in standards, the exact wording can be important) and apparently isn't normative anyway. – David Thornley Sep 14 '10 at 14:43
  • okay, now I see what you're referring to, thanks for explanation. – Roman Cheplyaka Sep 14 '10 at 15:00
  • You're not allowed to write your own version of `printf`, period, unless you make it `static`. Redefining external symbols, not to mention ones defined by the standard, results in undefined behavior. Whether you included `stdio.h` or not is a much more minor issue. – R.. GitHub STOP HELPING ICE Sep 14 '10 at 17:16
1

These kinds of warnings indicate likely bugs and, as a result, are useful.

Yes, it might look inconsistent to have special cases for warnings in the compiler (assuming <stdio.h> doesn't just have a __printf_format_warning attribute or something like that), but then again, if it is useful and helps solve some bugs (maybe even security bugs), then why not have them?
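As jamesdlin's comment under the question points out, gcc does expose an attribute of this sort, spelled `__attribute__((format(printf, m, n)))`, and clang accepts it too. Below is a rough sketch of applying it to your own function. `PRINTF_LIKE` and `format_message` are names invented for this example, and the attribute is a compiler extension rather than standard C, so the macro expands to nothing elsewhere.

```c
#include <stdarg.h>
#include <stdio.h>

/* gcc/clang extension; on other compilers the macro expands to nothing. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)
#endif

/* Hypothetical wrapper: parameter 3 is the format string and the variadic
   arguments start at position 4, so the compiler can check them the same
   way it checks calls to printf itself. */
int format_message(char *buf, size_t size, const char *fmt, ...) PRINTF_LIKE(3, 4);

int format_message(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int written = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return written;
}
```

With the attribute in place, a call such as `format_message(buf, sizeof buf, "%s", 42)` should draw the same kind of format warning from gcc or clang that a mismatched printf call does.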

I mean, it's not like everyone just replaces their libc with their own, with different printf semantics...

luiscubal
1

The ultimate purpose of programming language standards is to help programmers write programs that behave as intended. There is nothing in the standard that says a compiler should issue a warning if it encounters "bigvar = byte3 << 24 + byte2 << 16 + byte1 << 8 + byte0;", but since the results probably aren't what the programmer intended, many compilers will issue a warning. The only limitation the standards impose upon warnings is that they must not prevent the successful compilation of a legitimate program (e.g. a compiler which failed with an error after outputting 999 warnings, or which output so many warnings that compilation would, for all practical purposes, never complete, would be non-conforming).
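For anyone wondering why that expression is suspicious: `+` binds more tightly than `<<`, so the additions happen first. Here is a small sketch with illustrative function names (`pack_bytes`, `as_parsed`, `as_intended`) of my own choosing, using small shift counts so that nothing overflows.

```c
#include <stdint.h>

/* What the programmer almost certainly meant: each byte shifted into place.
   The casts keep the shifts in unsigned 32-bit arithmetic. */
uint32_t pack_bytes(uint8_t byte3, uint8_t byte2, uint8_t byte1, uint8_t byte0)
{
    return ((uint32_t)byte3 << 24) | ((uint32_t)byte2 << 16) |
           ((uint32_t)byte1 << 8) | (uint32_t)byte0;
}

/* How the unparenthesized form a << 2 + b << 1 actually groups. */
unsigned as_parsed(unsigned a, unsigned b)
{
    return (a << (2 + b)) << 1;
}

/* The grouping the programmer had in mind. */
unsigned as_intended(unsigned a, unsigned b)
{
    return (a << 2) + (b << 1);
}
```

With a = 1 and b = 1, `as_parsed` gives 16 while `as_intended` gives 6. That gap is what warnings such as gcc's `-Wparentheses` are there to catch.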

There isn't any requirement that the compiler "know" about the standard libraries, but nor is there any requirement that it not know about them if someone #includes the normal headers. Indeed, if a program includes <stdio.h> I think it would be permissible under the standard for a compiler to replace a printf call that it can understand with something that might be easier to process at run-time (e.g. it could replace printf("Q%5d",foo); with "putch('Q'); __put_int(foo,5);" if desired). If a program does not #include <stdio.h> such translation would be forbidden.

supercat
0

Is it a conceptual break? Yes. But the "conceptual break" is that for all intents and purposes, C does now have "built-in functions". They're specified in the Standard, their names are all reserved (unless you're in a freestanding environment), and compilers are special-casing them in all sorts of ways.

For example, if I call printf without properly including <stdio.h>, I'll typically get a warning like "incompatible implicit declaration of built-in function ‘printf’" (gcc), or "implicitly declaring library function 'printf' with type 'int (const char *, ...)'" (clang). Both messages prove that the compiler knew about printf all along, whether or not I explicitly included a header file with an external declaration that told the compiler something.

Given that compilers do know about library functions, it's perfectly appropriate that one of the things they do with this knowledge, in the case of printf-like functions, is to double-check the actual arguments.

Yes, I do remember the days when library functions truly weren't built in, in any way, and there were things about those days that I miss, but really, the fact that compilers do now know about library functions isn't causing me any problems or costing me any sleep.

And I firmly believe that for a modern compiler to check the number and types of arguments handed to printf-like functions is more or less mandatory. Back in the good old days, back when library functions weren't part of the language, it was also the case that function prototypes didn't exist. If a programmer wanted to ensure that a function call matched its definition, it was up to the programmer either to be careful, or to run lint. But prototypes changed that, and today's programmers know that it's fine to call, say, sqrt(144). But for the same reason, today's programmers don't understand why they can't call printf("%f\n", 144). And I really can't blame today's programmers for that.
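Here is a small sketch of the difference. I'm using a made-up function `halve` in place of sqrt so the example doesn't depend on linking the math library, and `format_double` is likewise just an illustrative name.

```c
#include <stdio.h>

/* Hypothetical function with a prototype: the compiler knows the parameter
   is a double, so halve(144) converts the int 144 to 144.0 automatically. */
double halve(double x)
{
    return x / 2.0;
}

/* Variadic arguments get no such conversion: an int stays an int, and
   handing an int to %f is undefined behaviour. Converting explicitly
   makes the argument match the conversion specifier. */
int format_double(char *buf, size_t size, int n)
{
    return snprintf(buf, size, "%f", (double)n);
}
```

So `halve(144)` just works, while the printf equivalent needs `144.0` or a cast. The format-string check is what closes that gap for variadic calls.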

Steve Summit