
Reading Stanley Lippman's "C++ Primer", I learned that by default a decimal integer literal is signed (the smallest of int, long, or long long in which the literal's value fits), whereas octal and hexadecimal literals can be either signed or unsigned (the smallest of int, unsigned int, long, unsigned long, long long, or unsigned long long in which the literal's value fits).

What's the reason for treating those literals differently?

Edit: I'm trying to provide some context

int main()
{
    auto dec = 4294967295;
    auto hex = 0xFFFFFFFF;
    return 0;
}

Debugging the following code in Visual Studio shows that the type of dec is unsigned long and that the type of hex is unsigned int.
This contradicts what I've read but still: both variables represent the same value but are of different types. That's confusing me.

Peter Cordes
  • Please give some context. – Weather Vane Apr 11 '16 at 22:08
  • Because when a programmer is dealing with a non-decimal literal, it is usually for some kind of bit manipulation, where what matters is not the sign but the physical representation. – Eugene Sh. Apr 11 '16 at 22:09
  • I actually thought integer literals are always int/unsigned int and can overflow, because GCC warns about it. – user3528438 Apr 11 '16 at 22:11
  • @user3528438 It warns because signed overflow is undefined. `-1 < 10` will always return true, so it is signed. – Eugene Sh. Apr 11 '16 at 22:15
  • This sure sounds specific enough to be mentioned in the C specifications somewhere. I always thought of number notation as ... neutral. Surely `02` is the same as `2` and `0x2`? – Jongware Apr 11 '16 at 22:15
  • The literal value you give does not drive the variable type: that comes first. – Weather Vane Apr 11 '16 at 22:18
  • @WeatherVane C Standard: "The type of an integer constant is the first of the corresponding list in which its value can be represented." – AlexD Apr 11 '16 at 22:24
  • @AlexD Right, apparently OP mistakenly used a C tag on a C++ question. – 2501 Apr 11 '16 at 22:29
  • In terms of `auto` and the type selected, you may have encountered a Visual Studio bug. – jxh Apr 11 '16 at 22:47
  • "This contradicts what I've read " - what did you read? It is correct according to what you said in your first paragraph. – M.M Apr 12 '16 at 01:08
  • @M.M I thought (according to my first paragraph) that decimal literals can only be of the types `int`, `long` or `long long`. So `dec` shouldn't be an `unsigned int` but according to the Visual Studio debugger it is. – binary-riptide Apr 12 '16 at 05:56
  • You're right that decimal literals can only be signed types *if the literal fits in a signed type*. If it doesn't, then it's undefined behaviour. Prior to C++11, which added `long long`, `4294967295` did not fit in any signed type, so it was undefined behaviour (which can manifest as generating an unsigned literal of some type, or anything else). C++11 didn't change the nature of the rules; it just added `long long int` to the list. If your compiler is supposed to be C++11-compliant (VS2013 and later?) and it does what you said, then it is non-conforming, which might be deliberate. – M.M Apr 12 '16 at 06:48
  • I think C89 defined that the literal would be `unsigned long` in this case, so MS may have shared code between their C and C++ compilers – M.M Apr 12 '16 at 06:50

1 Answer


C++.2011 changed its promotion rules from C++.2003. This change is documented in §C.2.1 [diff.cpp03.lex]:

2.14.2
Change: Type of integer literals
Rationale: C99 compatibility

The C Standard, both C.1999 and C.2011, defines the conversions in §6.4.4.1. (C++.2011 §2.14.2 essentially copies the content from the C Standard.)

The type of an integer constant is the first of the corresponding list in which its value can be represented.

The table's row for suffix-less constants (from §6.4.4.1) is:

  • decimal constant: int, long int, long long int
  • octal or hexadecimal constant: int, unsigned int, long int, unsigned long int, long long int, unsigned long long int

The C.1999 rationale gives the following explanation:

The C90 rule that the default type of a decimal integer constant is either int, long, or unsigned long, depending on which type is large enough to hold the value without overflow, simplifies the use of constants. The choices in C99 are int, long and long long. C89 added the suffixes U and u to specify unsigned numbers. C99 adds LL to specify long long.

Unlike decimal constants, octal and hexadecimal constants too large to be ints are typed as unsigned int if within range of that type, since it is more likely that they represent bit patterns or masks, which are generally best treated as unsigned, rather than “real” numbers.

jxh
  • The C++ standard states: **Change: Type of integer literals. Rationale: C99 compatibility.** for the changes in C++11 (p. 1245, N3797). Feel free to include it in your answer. The tags are now c++ only. – Captain Giraffe Apr 11 '16 at 22:36