Why does std::bitset suggest more available bits than sizeof says there are?

Question

I'm working on some simple bit manipulation problems in C++, and came across this while trying to visualize my steps. I understand that the number of bits assigned to different primitive types may vary from system to system. For my machine, sizeof(int) outputs 4, so I've got 4 chars' worth of bits for my value. I also know now that a byte is usually 8 bits, but that this is not necessarily the case. When I output CHAR_BIT I get 8. I therefore expect there to be a total of 32 bits for my int values.

I can then go ahead and print the binary value of my int to the screen:

int max=~0; //All my bits are turned on now
std::cout<<std::bitset<sizeof(int)*CHAR_BIT>(max)<<std::endl;

$:11111111111111111111111111111111

I can increase the bitset size if I want though:

int max=~0;
std::cout<<std::bitset<sizeof(int)*CHAR_BIT*3>(max)<<std::endl;

$:000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111

Why are there so many ones? I would have expected only 32 ones, padded with zeros. Instead there are twice as many. What's going on?

When I repeat the experiment with unsigned int, which has the same size as int, the extra ones don't appear:

unsigned int unmax=~0;
std::cout<<std::bitset<sizeof(unsigned int)*CHAR_BIT*3>(unmax)<<std::endl;

$:000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111111

By the way, mad props for fully following the standard with sizeof() and CHAR_BIT. Most people just take these things for granted, especially the latter. — , Apr 23 '18 at 04:21
Thanks for the detailed answer, it makes sense to me now. I'm currently preparing for an upcoming interview so I'm trying to learn as much as possible about how these things work, glad I'm on the right track! — jonthalpy, Apr 23 '18 at 05:11

score 20 · Accepted Answer · 2018-04-23T12:20:52.743

The constructor of std::bitset takes an unsigned long long. When you pass it -1 (which is what ~0 is in an int on a two's complement machine), the value is converted to unsigned long long, and you get 8 bytes (64 bits) worth of 1s.

It doesn't happen with unsigned int because the value being converted is 4294967295 rather than -1, and 4294967295 as an unsigned long long is 32 zeros followed by 32 1s.

edited Apr 23 '18 at 12:20

answered Apr 23 '18 at 04:14

It's worth mentioning the terms "zero-extension" and "sign-extension" so that people know what to search for if they want to read further. – ildjarn Apr 23 '18 at 04:46
Since the question goes to great pains to maintain platform independence (to the point of not assuming `CHAR_BIT` to be 8), you might want to mention that `~0 == -1` only holds for 2's complement representation. – Angew is no longer proud of SO Apr 23 '18 at 07:49
Hm, I don't see how you get 2147483647, did you mean 4294967295? – pipe Apr 23 '18 at 08:43
Indeed, the max value for `unsigned int` is `4294967295` (32 1s correspond to `2^32 - 1`). – cute_ptr Apr 23 '18 at 10:01
Ugh, that's what I get for answering questions late Sunday evening... Fixed. Thanks! – Apr 23 '18 at 12:01

cute_ptr · Answer 2 · 2018-04-23T07:39:02.143

When you write int max=~0;, max will be 32 bits filled with 1s, which, interpreted as a signed integer, is -1.

When you write

std::bitset<sizeof(int)*CHAR_BIT>(max)
// basically, same as
std::bitset<32>(-1)

You need to keep in mind that the std::bitset constructor takes an unsigned long long. So the -1 that you pass to it gets converted to a 64-bit representation of -1, which is 64 bits all set to 1 (because the value is negative, sign extension preserves it by filling the 32 leftmost bits with 1s).

Therefore, the constructor of std::bitset gets an unsigned long long all filled with 1s, and it initializes the 32 bits you asked for with 1s. So, when you print it, you get:

11111111111111111111111111111111

Then, when you write:

std::bitset<sizeof(int)*CHAR_BIT*3>(max)
// basically, same as
std::bitset<96>(-1)

The std::bitset constructor will initialize the 64 rightmost bits of the 96 that you asked for with the value of the unsigned long long that you passed, so those 64 bits are filled with 1s. The remaining bits (the 32 leftmost) are initialized with zeros. So when you print it, you get:

000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111

On the other hand, when you write unsigned int unmax=~0;, you're assigning all 1s to an unsigned int, so you get UINT_MAX.

Then, when you write:

std::bitset<sizeof(unsigned int)*CHAR_BIT*3>(unmax)
// basically, same as
std::bitset<96>(UINT_MAX)

The UINT_MAX that you pass gets converted to a 64-bit representation, which is the 32 rightmost bits filled with 1s and the remaining bits all 0s (because the source type is unsigned, the value is zero-extended: the 32 leftmost bits are filled with 0s).

So the unsigned long long that the std::bitset constructor gets is represented as 32 0s followed by 32 1s. It will initialize the 64 rightmost bits of the 96 that you asked for with 32 0s followed by 32 1s. The remaining 32 leftmost bits (of 96) are initialized with zeros. So when you print it, you get (64 0s followed by 32 1s):

000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111111
