17

This is valid, because a constant expression is allowed to take the value of "a glvalue of literal type that refers to a non-volatile object defined with constexpr, or that refers to a sub-object of such an object" (§5.19/2):

constexpr char str[] = "hello, world";
constexpr char e = str[1];

However, it would seem that string literals do not fit this description:

constexpr char e = "hello, world"[1]; // error: literal is not constexpr

2.14.5/8 describes the type of string literals:

Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration.

It would seem that an object of this type could be indexed, if only it were temporary and not of static storage duration (5.19/2, right after the above snippet):

[constexpr allows lvalue-to-rvalue conversion of] … a glvalue of literal type that refers to a non-volatile temporary object whose lifetime has not ended, initialized with a constant expression

This is particularly odd since taking the lvalue of a temporary object is usually "cheating." I suppose this rule applies to function arguments of reference type, such as in

constexpr char get_1( char const (&str)[ 6 ] )
    { return str[ 1 ]; }

constexpr char i = get_1( { 'y', 'i', 'k', 'e', 's', '\0' } ); // OK
constexpr char e = get_1( "hello" ); // error: string literal not temporary

For what it's worth, GCC 4.7 accepts get_1( "hello" ), but rejects "hello"[1] because "the value of ‘._0’ is not usable in a constant expression"… yet "hello"[1] is acceptable as a case label or an array bound.
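
For reference, a minimal sketch of the get_1 case, restated so that the results can be checked with static_assert. The named array `word` is a hypothetical stand-in for the braced list, and the string-literal call assumes a compiler that accepts it, as GCC 4.7 is reported to do above:

```cpp
// Same function as in the question: reads element 1 of a 6-char array.
constexpr char get_1(char const (&str)[6]) { return str[1]; }

// Hypothetical named array: a constexpr object, so its elements are
// usable in constant expressions under the first rule quoted from 5.19/2.
constexpr char word[6] = "yikes";
constexpr char i = get_1(word);

// Assumption: the compiler treats elements of a string literal as
// usable in constant expressions.
constexpr char e = get_1("hello");

static_assert(i == 'i', "element 1 of \"yikes\"");
static_assert(e == 'e', "element 1 of \"hello\"");
```

If the literal call is rejected, only the second static_assert and the definition of `e` need to be removed; the named-array case does not depend on how string literals are classified.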

I'm splitting some Standardese hairs here… is the analysis correct, and was there some design intent for this feature?

EDIT: Oh… there is some motivation for this. It seems that this sort of expression is the only way to use a lookup table in the preprocessor. For example, this introduces a block of code which is ignored unless SOME_INTEGER_FLAG is 1 or 5, and causes a diagnostic if greater than 6:

#if "\0\1\0\0\0\1"[ SOME_INTEGER_FLAG ]

This construct would be new to C++11.

Xeo
Potatoswatter

2 Answers

6

The intent is that this works. In a post-C++11 draft, the paragraphs that state when an lvalue-to-rvalue conversion is valid will be amended with a note saying that an lvalue referring to a subobject of a string literal is a constant integer object initialized with a constant expression, which is already described as one of the allowed cases.
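
As a small sketch of that reading, assuming a compiler that already treats string-literal elements this way (the names `second` and `Buffer` are made up for illustration):

```cpp
// Direct subscript of a string literal in a constant expression.
constexpr char second = "hello, world"[1];
static_assert(second == 'e', "index 1 of the literal is 'e'");

// The same kind of expression used where an integral constant
// expression is required, here as an array bound.
struct Buffer {
    char data["hello"[1] == 'e' ? 4 : 8];
};
static_assert(sizeof(Buffer) == 4, "bound was computed from the literal");
```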

Your comment about use within the preprocessor is interesting, but I'm unsure whether that is intended to work. This is the first I have heard of it.

Johannes Schaub - litb
  • I think notes are non-normative, so why a note? – Cheers and hth. - Alf Sep 15 '11 at 21:05
  • 2
    I think the point is that the wording already supports this interpretation, it's just not obvious enough and there were doubts; hence a note to help interpret this particular corner case. Anyway, that's awesome news. – Pavel Minaev Sep 15 '11 at 22:11
  • 1
    Interesting! Since case 2 under 5.19 lvalue-to-rvalue conversion refers to subobjects, I wasn't sure whether case 1 applied to members of arrays. So, there is a lot of overlap between those cases. As for the preprocessor trick… `#if` relies on squishing identifiers down to `0` to sanitize constant expressions. Then 16.1/4 applies, "The resulting tokens comprise the controlling constant expression which is evaluated according to the rules of 5.19 using arithmetic that has at least the ranges specified in 18.3." Come to think of it, a UD-suffixed string is immune to the `0` conversion! Uh-oh! – Potatoswatter Sep 15 '11 at 22:48
  • Yes, I agree. The ud-suffix derives to an identifier, but as a preprocessing token (which is the smallest unit the preprocessor operates on) it is part of a user-defined-string-literal token. So `"foo"_x` will never be transformed to `"foo"0` according to my understanding. It stays `"foo"_x`, and then the question is what happens to the suffix: Is it ignored? Is it an error? The third possibility (IMO unacceptable, because it goes way beyond the preprocessor) is: will it be looked up? – Johannes Schaub - litb Sep 17 '11 at 13:21
  • It looks to me like according to the current letter of the law, it needs to be looked up. A UD-literal has a value defined by a function call, and 5.19 gives rules about what function calls are allowed. This is impossible if it is a member of a template, of course. Probably a defect report is in order. Another spot in clause 16 assumes string decoration consists of only a possible leading `L`. And strings are just a special case; preprocessing numbers are allowed, which have not yet been converted to numeric values. GCC and Clang reject FP values, even though no standard forbids those. – Potatoswatter Sep 19 '11 at 03:52
1

Regarding your question about #if, it was not the intent of the standards committee to increase the set of expressions which can be used in the preprocessor, and the current wording is considered to be a defect. This will be listed as core issue 1436 in the post-Kona WG21 mailing. Thanks for bringing this to our attention!
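
Outside the preprocessor, the same lookup table can still be consulted in an ordinary constant expression. The following is only a sketch: the default value of SOME_INTEGER_FLAG and the name `flag_enabled` are assumptions made for illustration, and it selects a compile-time constant rather than conditionally compiling a block as `#if` would.

```cpp
// Hypothetical default so the sketch is self-contained.
#ifndef SOME_INTEGER_FLAG
#define SOME_INTEGER_FLAG 5
#endif

// The table from the question: nonzero only at indices 1 and 5.
// The literal has 7 elements (6 characters plus the terminator), so an
// index greater than 6 is out of bounds and not a constant expression,
// which makes the initialization below ill-formed.
constexpr bool flag_enabled = "\0\1\0\0\0\1"[SOME_INTEGER_FLAG] != 0;

static_assert(flag_enabled, "SOME_INTEGER_FLAG must be 1 or 5");
```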

Richard Smith
  • [DR 366](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#366) covers the conditional preprocessing issue in terms of §5.19. Apparently the working draft of the standard was patched but the fix was overwritten by a later change. Probably the new fix belongs in §16, since the current intent is to have powerful constant expressions. – Potatoswatter Apr 08 '12 at 14:56