Can someone explain overflow in C# using binary?

Question

I'm currently reading a book on C# programming and it has brief intro to Overflow and Underflow and the writer gives a general idea of what happens when you go over the allowed range of a certain type.

Example

short a = 30000;
short b = 30000;
short sum = (short)(a + b); // Explicitly cast back into short
Console.WriteLine(sum); // This would output the value -5536

So the short type only has a range of from -32768 to 32767 and in the book the explanation given by the writer is "For integer types (byte, short, int, and long), the most significant bits (which overflowed) get dropped. This is especially strange, as the computer then interprets it as wrapping around. This is why we end up with a negative value in our example. You can easily see this happening if you start with the maximum value for a particular type (for example, short.MaxValue) and adding one to it." The C# Players Guide Second edition, chapter 9, page 58

You'd end up at the minimum (-32768).

I'm having a problem understanding this I get confused when the writer talks about "the computer interprets it as wrapping around"

The way I tried understanding this was that the short type uses 2 bytes (16 bits)

So the number 32767 = 0111111111111111 if i was to +1 to the binary string I'd end up with 32768 = 1000000000000000 (which cannot be represented with the short type as max value is 32767) so the compiler gives -32768. Why does it end up as a negative?

I understand the concept of using twos compliment to represent negative numbers and could someone correct my thinking here or elaborate because i don't fully understand why we only use 15 bits of the 16 bits to represent the positive range and the most significant bit for the negative range

On signed values, one bit (the MSB) is reserved as the sign bit, to indicate if the number is positive or negative. How else could you tell if the number was negative or not? You're not accounting for that bit needing to be present. — Ken White, May 25 '16 at 23:47
Nothing much to explain. This is by convention. The msb is the sign bit. How else would you have it? — Luc Morin, May 25 '16 at 23:47
You say "I understand the concept of using twos compliment to represent negative numbers" but your question is "explain to me how twos complement works". I think you do not understand how twos complement works. — Eric Lippert, May 26 '16 at 00:03
So binary numbers that are negative would always have a 1 as the MSB (by convention) and unsigned binary would have to have a 0 as the MSB. I didn't know the convention that a one as the MSB represents a sign bit. I think the confusion was because I used "pure" binary representations I didn't really understand the twos compliment convention. — Shabubble, May 26 '16 at 00:07
Everyone who is telling you that the high bit is a sign bit is just adding to your confusion. This has nothing to do with sign bits. — Eric Lippert, May 26 '16 at 00:11
Now, when you learn about *floats*, those really *do* have a sign bit. The top bit of a float is the sign, and the rest of the float gives the magnitude. A fun fact: that means that there is both "positive zero" and "negative zero" that can be represented by a float! — Eric Lippert, May 26 '16 at 00:43

Eric Lippert · Accepted Answer · 2016-05-26T00:34:18.810

Ignore everyone who is telling you that the top bit is a sign bit. That is the wrong way to think about it.

The right way to think about it is:

We have 65536 possible bit patterns.
Therefore we can represent 65536 possible numbers.
We must have a map that assigns meaning to each bit pattern.

For unsigned shorts, we assign bit patterns as follows:

0000000000000000 --> 0
0000000000000001 --> 1
...
0111111111111111 --> 32767
1000000000000000 --> 32768
1000000000000001 --> 32769
...
1111111111111111 --> 65535

For signed shorts, we use the following convention:

0000000000000000 --> 0
0000000000000001 --> 1
...
0111111111111111 --> 32767
1000000000000000 --> -32768
1000000000000001 --> -32767
...
1111111111111111 --> -1

Simple as that.

Why do we use this convention?

Three reasons:

(1) The first 32K values are the same whether you are signed or unsigned. That is very convenient.

(2) In both conventions, "all zero bits" means zero.

(3) Because addition works exactly the same in both conventions!

We have

0000000000000000 --> 0

and we want to add one. We add one using binary rules and we get:

0000000000000001 --> 1

This works whether the short is signed or unsigned.

We have unsigned short:

1000000000000000 --> 32768

We wish to add one. We do so using binary rules, and we get the right answer:

1000000000000001 --> 32769

Same for signed shorts. We have

1000000000000000 --> -32768

and we wish to add one. We do so with binary rules and we get:

1000000000000001 --> -32767

Similarly, you can verify that by adding 1111111111111111 to any binary number, you get "one less", and therefore subtraction of one works as well as addition of one. You can then go on to show that in general, addition and subtraction work the same in both signed and unsigned arithmetic, which means the processor doesn't have to know whether the compiler thinks the values are signed or unsigned.

That is why we use twos complement: because the underlying math is exactly the same whether you are doing signed or unsigned arithmetic.

Notice that I said nothing whatsoever about a "sign bit" in there. The fact that the high bit is set for negative numbers is just a nice bonus. The real property we want is to only have to build hardware to do math once.

Twos complement is just a convention for assigning one of two possible meanings to bit patterns based on the type associated with the variable holding that bit pattern. As an industry, we chose this convention because it is cheap to make high performance hardware that uses this convention.

There are a great many other conventions that we could have chosen. For example, we could have said that for signed numbers we use the map:

0000000000000000 --> -32768
0000000000000001 --> -32767
...
0111111111111111 --> -1
1000000000000000 --> 0
1000000000000001 --> 1
...
1111111111111111 --> 32767

Notice that this is exactly the same as before except for the top bit!

In this interpretation, addition still works as we expect. But this map does not have the property that the first 32K values are the same between short and ushort, and it does not have the property that all zero bits means zero. So we do not use this convention, because it is less convenient. In this convention, converting from short to ushort would require doing an addition. Setting a short to zero requires setting the bytes to something other than zero. We could use a convention where the signed numbers were "in order", we just don't because it is painful to do so.

Nice explanation makes a lot more sense :) I'm having trouble understanding the "The first 32K values are the same whether you are signed or unsigned." — Shabubble, May 26 '16 at 00:31
Wait i think i know 1000000000000000 --> 32768 and 1000000000000000 --> -32768 have both the same bit pattern, I didn't see that before — Shabubble, May 26 '16 at 00:35
@Shabubble: What is the bit pattern for 123 in a signed short? What is the bit pattern for 123 in an unsigned short? They are the same. In fact they are all the same, right up to 32767. — Eric Lippert, May 26 '16 at 00:35
@Shabubble: 1000000000000000 is the first one that is *different* depending on whether it is short or ushort. — Eric Lippert, May 26 '16 at 00:36
I understand now, 123 ----> 01111011 for both signed and unsigned, Would my explanation for why the range halves for signed bits be correct. Lets say for example you have 65535 in binary, inverse the bits, add one (twos compliment) you would no longer have 2 bytes you would have 2 bytes and a bit 17 bits is it that in order to stay within range of 2 bytes you can only do twos compliment with numbers around the 32k range or (32768) and still stay within 2 bytes — Shabubble, May 26 '16 at 01:15

Can someone explain overflow in C# using binary?

1 Answers1