
This code:

#define __STDC_FORMAT_MACROS
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(int argc,char **argv)
{
   uint64_t val=1234567890;
   printf("%"PRId64"\n",val);
   exit(0);
}

This compiles as C99, C++03, and C++11 with GCC 4.5, but fails to compile as C++11 with GCC 4.7.1. Adding a space before PRId64 lets GCC 4.7.1 compile it.

Which compiler is correct?

Xeo
rubenvb
  • Actually, you need PRIu64, not PRId64, to print an unsigned value (in general, PRI{o,u,x,X}N for unsigned and PRI{i,d}N for signed). – Ambroz Bizjak Aug 08 '12 at 19:42

1 Answer

gcc 4.7.1 is correct. According to the standard,

2.2 Phases of translation [lex.phases]

1 - The precedence among the syntax rules of translation is specified by the following phases. [...]
3. The source file is decomposed into preprocessing tokens (2.5) and sequences of white-space characters (including comments). [...]
4. Preprocessing directives are executed, macro invocations are expanded, [...]

And per 2.5 Preprocessing tokens [lex.pptoken], user-defined-string-literal is a preprocessing token production:

2.14.8 User-defined literals [lex.ext]

user-defined-string-literal:
    string-literal ud-suffix
ud-suffix:
    identifier

So the phase-4 macro expansion of PRId64 is irrelevant, because "%"PRId64 has already been parsed as a single user-defined-string-literal preprocessing token consisting of string-literal "%" and ud-suffix PRId64.
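The effect of phase-3 tokenization can be observed with a suffix that does start with an underscore, which is the form reserved for user code. The following is a minimal sketch, not taken from the answer: the macro `_demo_sfx`, the literal operator, and both function names are made up for illustration, and a macro name with a leading underscore is used only because the suffix must have one.

```cpp
#include <cstddef>
#include <string>

// Hypothetical literal operator. It is written without a space after ""
// so that ""_demo_sfx is itself a single token.
std::string operator""_demo_sfx(const char* s, std::size_t n) {
    return "udl(" + std::string(s, n) + ")";
}

// Hypothetical macro with the same spelling as the suffix, defined after
// the operator so that the operator's declaration cannot be affected.
#define _demo_sfx "-macro"

// No space: "x"_demo_sfx is one user-defined-string-literal token, so the
// macro is never considered and the literal operator is called.
std::string no_space() { return "x"_demo_sfx; }

// With a space: "x" and _demo_sfx are separate tokens, so the macro is
// expanded and the two adjacent string literals are concatenated.
std::string with_space() { return "x" _demo_sfx; }
```

Under these assumptions, `no_space()` returns `udl(x)` and `with_space()` returns `x-macro`. The only difference between the two functions is the whitespace, which is the same difference that separates the failing and the compiling `printf` call.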

Oh, this is going to be awesome; everyone will have to change

printf("%"PRId64"\n", val);

to

printf("%" PRId64"\n", val);     // note extra space

However, gcc and clang have agreed to treat a string literal followed by a suffix without a leading underscore as two separate tokens (such a suffix would make the program ill-formed anyway); see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52538. So in future versions of gcc (the 4.8 branch, I think), existing code will work again.
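Code that has to build with every compiler can simply keep whitespace around the macro. A minimal sketch of that spelling follows; the helper name `format_u64` and the buffer size are assumptions made for illustration, and it uses PRIu64 because the value in the question is unsigned.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats an unsigned 64-bit value followed by a newline.
std::string format_u64(std::uint64_t val) {
    // 20 digits for the largest value, plus newline and terminator, fit in 32.
    char buf[32];
    // The spaces keep PRIu64 a separate identifier token, so it is
    // macro-expanded and the adjacent string literals are concatenated.
    std::snprintf(buf, sizeof buf, "%" PRIu64 "\n", val);
    return buf;
}
```

Calling `format_u64(1234567890)` yields the string `1234567890` followed by a newline, the same text the question's program is meant to print.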

ecatmur
  • But how on earth can a user-defined literal be defined/parsed *before* platform ifdef's are processed? UDL can do all `constexpr` stuff, right? – rubenvb Aug 08 '12 at 17:10
  • @rubenvb that's fine; `constexpr` is a phase-7 process. – ecatmur Aug 08 '12 at 17:17
  • @ecatmur: but the UDL will be gone by then, due to preprocessor macro replacement in phase 4. – rubenvb Aug 08 '12 at 17:18
  • @rubenvb if it helps you understand, `"lit"_udl` is treated as `operator "" _udl("lit", 3)`; this happens early in phase 7. – ecatmur Aug 08 '12 at 17:23
  • but by the time that happens, in this example, your `_udl` is already replaced by the preprocessor... – rubenvb Aug 08 '12 at 17:25
  • @rubenvb ah right; there's nothing there for macro replacement to hit; `"%"PRId64` is a *single* token. – ecatmur Aug 08 '12 at 17:25
  • Ick. IMHO writing it with spaces: `printf("%" PRId64 "\n", val);` is better style anyway, but it's still a change that breaks existing code. Incidentally, ud-suffixes not starting with an underscore are reserved for future standardization, so the program is "ill-formed, no diagnostic required". Reference: [N3337](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf) 2.14.8p10 and 17.6.4.3.5 – Keith Thompson Aug 08 '12 at 18:42
  • @rubenvb this has caused some discussion and there is a fix in the works; see my latest edit. – ecatmur Aug 08 '12 at 22:22
  • Good for gcc and clang -- but the code is still ill-formed as far as the language standard is concerned, and other compilers won't necessarily do the same thing. The bug report mentions that the committee is considering this issue. Also, `<inttypes.h>` is probably the most common source of this problem, but other code could also be broken by the change. – Keith Thompson Aug 09 '12 at 00:13
  • "lit"_udl is a single token in C++11. So it precedes the preprocessor. – emsr Aug 09 '12 at 03:11
  • Sorry if I'm repeating but is adding the space the solution for this? Is it guaranteed not to change the behaviour? – Hanna Khalil Aug 15 '16 at 13:50
  • @HannaKhalil yes, adding the space is the solution and will work on all compilers (since compile-time string concatenation works however much whitespace there is). Omitting the space was only ever a stylistic matter. – ecatmur Aug 15 '16 at 13:57
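
The rewrite mentioned in the comments, where `"lit"_udl` is treated as a call to the literal operator with the string and its length, can be checked with a small sketch. The suffix `_len` and both function names are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>

// Hypothetical literal operator that returns the length the compiler passes.
constexpr std::size_t operator""_len(const char*, std::size_t n) { return n; }

// The length excludes the terminating null character.
static_assert("lit"_len == 3, "length of \"lit\" is 3");

// The literal form and the explicit call name the same function.
std::size_t via_literal() { return "lit"_len; }
std::size_t via_call() { return operator""_len("lit", 3); }
```

Both functions return 3. The literal operator is only looked up after preprocessing, so by then the question of whether the suffix is part of the token has already been settled.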