Representation of -1 due to bit overflow?

Question

Hey, I was trying to figure out why -1 << 4 (left shift) is FFF0. After looking and reading around the web, I came to know that "negative" numbers have a sign bit, i.e. 1. So, because "-1" would mean an extra 1 bit, i.e. 33 bits, which isn't possible, we consider -1 as 1111 1111 1111 1111 1111 1111 1111 1111.

For instance :-

#include<stdio.h>
void main()
{
printf("%x",-1<<4);
}

In this example we know that –

The internal representation of -1 is all 1’s (1111 1111 1111 1111 1111 1111 1111 1111) on a 32-bit compiler.
When we left-shift a negative number by 4 bits, the least significant 4 bits are filled with 0’s.
The format specifier %x prints the specified integer value in hexadecimal format.
After shifting, 1111 1111 1111 1111 1111 1111 1111 0000 = FFFFFFF0 will be printed.

Source for the above http://www.c4learn.com/c-programming/c-bitwise-shift-negative-number/

Left shift on negative numbers is undefined behaviour, I think. — Osiris, Aug 20 '18 at 16:18
Are you familiar with [Two's Complement](https://en.wikipedia.org/wiki/Two's_complement)? This is different from a simple sign bit. — tadman, Aug 20 '18 at 16:20
@Osiris it's question from a valid question paper in the past. — MasterKas, Aug 20 '18 at 16:20
@Osiris: Indeed it does — [C11 §6.5.7 Shift operators](http://port70.net/~nsz/c/c11/n1570.html#6.5.7): _The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined._ — Jonathan Leffler, Aug 20 '18 at 16:20
@Osiris so basically on a 32 bit it's not possible to add -1 bit so we treat (33 bit) as maximum value i.e all 1's on 32 bit — MasterKas, Aug 20 '18 at 16:23
The sign bit is the first bit of the 32-bit integer, not some other 33rd bit. — eesiraed, Aug 20 '18 at 16:24
Did anyone notice that the OP is programming on a 16 bit system? His compiler is most likely not even C99 compliant. — chqrlie, Aug 20 '18 at 19:28

score 6 · Accepted Answer · answered Aug 20 '18 at 16:33

First, according to the C standard, the result of a left-shift on a signed variable with a negative value is undefined. So from a strict language-lawyer perspective, the answer to the question "why does -1 << 4 result in XYZ" is "because the standard does not specify what the result should be."

What your particular compiler is really doing, though, is left-shifting the two's-complement representation of -1 as if that representation were an unsigned value. Since the 32-bit two's-complement representation of -1 is 0xFFFFFFFF (or 11111111 11111111 11111111 11111111 in binary), the result of shifting left 4 bits is 0xFFFFFFF0 or 11111111 11111111 11111111 11110000. This is the result that gets stored back in the (signed) variable, and this value is the two's-complement representation of -16. If you were to print the result as an integer (%d) you'd get -16.

This is what most real-world compilers will do, but do not rely on it, because the C standard does not require it.

answered Aug 20 '18 at 16:33

TypeIA


OP reported "-1 << 4 (left) shift is FFF0", which looks like a 16-bit value, not `0xFFFFFFF0`. – chux - Reinstate Monica Aug 20 '18 at 16:35
@chux I doubt it, more likely sloppiness on the OP's part. Real world 16 bit machines are rare these days, and the "sample output" in the question shows 32-bit values. But even if so, it doesn't change the answer - this is correct for any platform with a standard-compliant C compiler. – TypeIA Aug 20 '18 at 16:37
Hundreds of millions of embedded processors per year are 16-bit these days; they are not rare. – chux - Reinstate Monica Aug 20 '18 at 16:38
@chux I see no evidence the OP is using one of those, and again, even if they are, it does not change the validity of the answer. – TypeIA Aug 20 '18 at 16:39
The "-1 << 4 (left) shift is FFF0" is evidence, yet learners often mis-state questions. The other UB is printing an `int` with `"%x"` when its value is out of `unsigned` range. The key weakness in explaining UB comes when conveying some sort of correctness and overstating it as the behavior of "most real-world compilers" without reference. This answer does not provide a _valid_ explanation, any more than any behavior is valid with UB. It does provide a reasonable example of UB though. The last line is best: "do not rely on it". – chux - Reinstate Monica Aug 20 '18 at 16:48
@chux There is still educational value in understanding what a particular compiler is doing even in the face of UB, which is why I've included it in the answer together with very clear warnings that it is in fact UB. I don't view that as a weakness. – TypeIA Aug 20 '18 at 16:53
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/178370/discussion-between-typeia-and-chux). – TypeIA Aug 20 '18 at 17:04

score 4 · Answer 2 · edited Jun 20 '20 at 09:12

First things first: the tutorial uses void main. The comp.lang.c frequently asked question 11.15 should be of interest when assessing the quality of the tutorial:

Q: The book I've been using, C Programing for the Compleat Idiot, always uses void main().

A: Perhaps its author counts himself among the target audience. Many books unaccountably use void main() in examples, and assert that it's correct. They're wrong, or they're assuming that everyone writes code for systems where it happens to work.

That said, the rest of the example is ill-advised. The C standard does not define the behaviour of left-shifting a negative signed value. However, a compiler implementation is allowed to define behaviour for those cases that the standard leaves purposefully open. For example, GCC does define that

all signed integers have two's-complement format
<< is well-defined on negative signed numbers and >> works as if by sign extension.

Hence, -1 << 4 on GCC is guaranteed to result in -16; the bit representations of these numbers, given a 32-bit int, are 1111 1111 1111 1111 1111 1111 1111 1111 and 1111 1111 1111 1111 1111 1111 1111 0000 respectively.

Now, there is another undefined behaviour here: %x expects an argument of type unsigned int, but you're passing a signed int whose value is not representable in an unsigned int. However, on GCC with common libcs, the bytes of the signed integer are most probably interpreted as an unsigned integer, 1111 1111 1111 1111 1111 1111 1111 0000 in binary, which in hex is FFFFFFF0.

However, a portable C program should really never

assume two's complement representation - when the representation is of importance, use unsigned int or even uint32_t
assume that the << or >> on negative numbers have a certain behaviour
use %x with signed numbers
write void main.

A portable (C99, C11, C17) program for the same use case, with defined behaviour, would be

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    printf("%" PRIx32, (uint32_t)-1 << 4);
}

"-1 << 4 on GCC is guaranteed to result in -16" --> Even if `int` is 2's complement, does gcc _guarantee_ that behavior? With gcc's many optimization abilities, I find it curious that the compiler would specify that restrictive behavior, versus the potentially speedier UB. — chux - Reinstate Monica, Aug 20 '18 at 16:55
Thanks. I see GCC has it in [Bitwise operators act on the representation of the value including both the sign and value bits, where the sign bit is considered immediately above the highest-value value bit.](https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/Integers-implementation.html#Integers-implementation) — chux - Reinstate Monica, Aug 20 '18 at 18:00
