C++ and unsigned types

Question

I'm reading the C++ Primer 5th Edition, and I don't understand the following part:

In an unsigned type, all the bits represent the value. For example, an 8-bit unsigned char can hold the values from 0 through 255 inclusive.

What does "all the bits represent the value" mean?

In the same way as all the fingers on your hand represent your entire hand. — Sam Varshavchik, Jul 01 '18 at 02:43
I suppose this statement is not correct. In both signed and unsigned types all bits represent the value. For example, 8 bits can represent 256 unique values regardless of whether the type is signed or unsigned. — 273K, Jul 01 '18 at 03:52
It means that all bits contained in the variable can be set differently, and each combination of which bits are set, or not set, corresponds to a distinct value. So changing any bit changes the value or (conversely) changing the value means at least one bit has changed. Note: the statement is in error, since it implies this is only true for unsigned types, but it is true for some other types as well. — Peter, Jul 01 '18 at 04:54
@S.M. this statement is about unsigned types, it is saying nothing about signed types so I am not sure how you consider it incorrect — M.M, Jul 02 '18 at 09:22
Don't worry about your English, your question is clear and well phrased. And people here will happily help with grammar and spelling if everything else is written well. — StoryTeller - Unslander Monica, Jul 02 '18 at 10:06
@S.M. your statement is incorrect. There can be padding bits/trap representations in types wider than char. In that case not all bits represent the value — phuclv, Jan 13 '20 at 05:36

score 1 · Answer 1 · answered Jul 01 '18 at 02:48


You should compare this to a signed type. In a signed value, one bit (the top bit) is used to indicate whether the value is positive or negative, while the rest of the bits are used to hold the value.
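
As a rough illustration of the comparison, the sketch below reads the same 8 bits once as unsigned and once as signed. The helper name `as_signed` is made up for this example, and it assumes the platform provides `std::int8_t` (an optional type that, where it exists, is two's complement with no padding bits).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret the 8 bits of an unsigned byte as a
// signed byte by copying the object representation. No arithmetic
// conversion takes place; only the interpretation of the bits changes.
inline std::int8_t as_signed(std::uint8_t bits) {
    std::int8_t out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}
```

Under those assumptions, `as_signed(0x7F)` is 127, while `as_signed(0x80)` is -128 and `as_signed(0xFF)` is -1: every pattern with the top bit set reads as negative, but the remaining seven bits are not simply the magnitude.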


John Burger


This is one possible implementation of `signed` integers, but this is very rare in the real world. It's known as ones complement. Real systems including x86, ARM, MIPS, POWER and SPARC all use twos complement, where this is not true. – MSalters Jul 02 '18 at 08:35
@MSalters all of the 3 permitted representations have the top bit being the sign bit, with 1 being negative; and in all cases the other bits hold the value (the representations differ in which bit settings correspond to which values) – M.M Jul 02 '18 at 09:25
@M.M: A "sign bit" implementation has `-1` equal to `1000...1`, i.e. the values X and -X differ only in a single bit, and the **other bits** encode `abs(X)`. Two's complement doesn't have a sign bit; **all bits** have a specific value. The top bit just happens to be the only bit with a negative value. – MSalters Jul 02 '18 at 10:24
@MSalters You're describing sign-magnitude representation. The term "sign bit" refers to the bit holding the sign in all 3 representations – M.M Jul 02 '18 at 10:42
@MSalters When I see that the top bit of a twos-complement number is a `1`, I know it is negative. I then invert all the bits, add one to the result, and get the value that is negative. My description specifically avoided saying ones vs twos complement, since in both systems the top bit is the sign indicator, while the remaining bits are the value - you just need to interpret them differently. – John Burger Jul 02 '18 at 11:32
Ones complement ‘suffers’ from the fact that there are two interpretations of zero: `-0` and `+0`. Twos complement ‘suffers’ from the fact that the largest positive number representation isn’t as large as the largest (absolute) negative representation - so (16-bit) `-(-32,768)` is `-32,768`. – John Burger Jul 02 '18 at 11:37
Twos complement has ‘won’ the representation ‘war’ because it makes for ‘natural’ arithmetic. The ALU doesn’t have to treat the sign bit differently: it can treat all operations as unsigned. It is important, however, for the flags to track the states of the top and second-top bits to check for overflow – John Burger Jul 02 '18 at 11:40

score 1 · Answer 2 · answered Jul 02 '18 at 10:03

The value of an object of trivially copyable type is determined by some bits in it, while other bits do not affect its value. In the C++ standard, the bits that do not affect the value are called padding bits.

For example, consider an 8-bit type whose last 4 bits are padding bits. Objects represented by 00000000 and 00001111 then have the same value and compare equal.

In reality, padding bits are often used for alignment and/or error detection.

With that in mind, you can understand what the book is saying: an unsigned type has no padding bits. However, the statement is wrong. In fact, the standard only guarantees that unsigned char (along with signed char and char) has no padding bits. The following is a quote from the relevant part of the standard, [basic.fundamental]/1:

For narrow character types, all bits of the object representation participate in the value representation.

Also, the C11 standard 6.2.6.2/1 says

For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter).
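
One way to probe this on a given implementation is to compare the size of the object representation with the number of value bits reported by `std::numeric_limits`. This is only a sketch: the names `object_bits`, `value_bits`, and `padding_bits` are invented here, and the helpers are meant for unsigned integer types only.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Bits in the object representation of T.
template <typename T>
constexpr int object_bits() { return static_cast<int>(sizeof(T) * CHAR_BIT); }

// Value bits of an unsigned integer type T
// (numeric_limits::digits counts the non-sign bits).
template <typename T>
constexpr int value_bits() { return std::numeric_limits<T>::digits; }

// Padding bits: whatever is left over in the object representation.
template <typename T>
constexpr int padding_bits() { return object_bits<T>() - value_bits<T>(); }

// Guaranteed by the standard: unsigned char has no padding bits.
static_assert(padding_bits<unsigned char>() == 0, "unsigned char has no padding");
```

On mainstream platforms `padding_bits<unsigned int>()` is also expected to be 0, but that is a property of the implementation, not something the quoted wording guarantees.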

score 0 · Answer 3 · answered Jul 01 '18 at 02:46


It means that all 8 bits represent an actual value, while in a signed char only 7 bits represent the actual value and the 8th bit (the most significant) represents the sign of that value: positive or negative (+/-).


Jovibor


Not really true. The hardware does not treat the top bit in a special way when doing basic arithmetic. A signed number still uses all the bits. It still has 256 states. Signed/unsigned is more about what range the number represents. The top bit is not special. It just happens to tell you the sign when it's a signed number. – William J Bagshaw Jul 02 '18 at 10:20
@William that's exactly what my answer is about. – Jovibor Jul 02 '18 at 11:58
At face value, what you say is wrong. -1 is not 1 with the top bit set (0x81). A signed char is not a seven-bit number with a sign bit; it is much more helpful to see it as an 8-bit signed number. -1 + 1 is zero, so we know -1 is all bits set, because adding 1 to it gives zero (0xFF + 0x01 = 0x00). It is interesting to observe that all numbers with the top bit set are negative, but that bit is not the sign of a seven-bit number. – William J Bagshaw Jul 02 '18 at 15:00

score 0 · Answer 4 · answered Jul 01 '18 at 02:48

For example, one byte contains 8 bits, and all 8 bits are used to count up from 0.

For unsigned, all bits zero = 00000000 means 0, 00000001 = 1, 00000010 = 2, 00000011 = 3, ... up to 11111111 = 255.
For a signed byte (or signed char) in this scheme, the leftmost bit holds the sign and therefore cannot be used to count. (I am visually separating the leftmost bit!) 0 0000001 = 1, but 1 0000001 = -1, 0 0000010 = 2, and 1 0000010 = -2, etc., up to 0 1111111 = 127 and 1 1111111 = -127. In this example, 1 0000000 would mean -0, which is useless/wasted, so it can instead mean, for example, -128.

There are other ways to encode numbers in bits, and some computers start from the left instead of from the right. These details are hardware-specific and not relevant to understanding 'unsigned'; you only need to care about them when you want to manipulate individual bits in code (not recommended).
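
The unsigned counting sequence can be reproduced with `std::bitset`. This is a small sketch assuming 8-bit bytes; the helper name `bits_of` is made up for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <string>

// Hypothetical helper: render the low 8 bits of a byte as a string of
// '0' and '1' characters, most significant bit first.
inline std::string bits_of(unsigned char v) {
    return std::bitset<8>(v).to_string();
}
```

For instance, `bits_of(3)` gives `"00000011"` and `bits_of(255)` gives `"11111111"`. Converting a negative number to `unsigned char` wraps modulo 256, so `bits_of(static_cast<unsigned char>(-127))` gives `"10000001"`, the pattern that the sign-and-magnitude scheme described here would read as -1.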

... except that sign-magnitude encoding is almost never used for integers, only floating-point — Ben Voigt, Jul 01 '18 at 03:09

score 0 · Answer 5 · answered Jul 02 '18 at 08:41

This is mostly a theoretical thing. On real hardware, the same holds for signed integers as well. Obviously, with signed integers, some of those values are negative.

Back to unsigned: what the text says is basically that the value of an unsigned number is simply b0*(1<<0) + b1*(1<<1) + b2*(1<<2) + ..., where bN is bit N (0 or 1), summed over all the bits in the type. Importantly, not only do all bits contribute, but all combinations of bits form a valid number. This is NOT the case for signed integers. Therefore, if you need a bitmask, it has to be an unsigned type of sufficient width, or you could run into invalid bit patterns.
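
The weighted sum can be checked directly. The function below is an illustrative sketch (the name `value_from_bits` is not from any library) that rebuilds an `unsigned char` value by adding up the weight of every bit that is set.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: rebuild the value of an unsigned char by summing
// the weight (1 << i) of each bit position i whose bit is 1.
inline unsigned value_from_bits(unsigned char v) {
    unsigned sum = 0;
    for (int i = 0; i < CHAR_BIT; ++i) {
        unsigned bit = (v >> i) & 1u;  // bit i of v: 0 or 1
        sum += bit << i;               // contributes 2^i when the bit is set
    }
    return sum;
}
```

For every `unsigned char` value the result equals the input, which is what "all the bits represent the value" amounts to for this type.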
