Since characters from -128 to -1 are the same as those from +128 to +255, what is the point of using unsigned char?

Question

#include <stdio.h>
#include <conio.h>
int main()
{
    char a=-128;
    while(a<=-1)
    {
        printf("%c\n",a);
        a++;
    }
    getch();
    return 0;
}

The output of the above code is the same as the output of the code below:

#include <stdio.h>
#include <conio.h>
int main()
{
    unsigned char a=+128;
    while(a<=+254)
    {
        printf("%c\n",a);
        a++;
    }
    getch();
    return 0;
}

Then why do we use unsigned char and signed char?

`-128` and `+128` are different values, however `printf("%c"` performs a conversion that prints the same character for both cases. — M.M, Mar 22 '16 at 06:17
@SouravGhosh overflow of `char` cannot happen (except on `sizeof(int)==1` machines) because `char` is promoted to `int` before any arithmetic. The `int` arithmetic may overflow, but then that is also a problem for `unsigned char` because the latter also promotes to `int`. — M.M, Mar 22 '16 at 06:18
@M.M That is in case of `%c` argument, I was talking in general. — Sourav Ghosh, Mar 22 '16 at 06:19
What you are missing is the importance of the `signed` aspect of `char`. That provides the range of `-128` to `127` but has more subtle implications related to *sign extension* during bitwise operations, etc. On the other hand, `unsigned char` has a range of `0` to `255` and does not undergo sign extension related to its leftmost bit. — David C. Rankin, Mar 22 '16 at 06:20
`-128 to 0` is 129 numbers. `+128 to +255` is 128 numbers. Obvious problem with title: "characters from -128 to 0 are same as from +128 to +255". Fixed — chux - Reinstate Monica, Mar 22 '16 at 16:26

score 3 · Answer 1 · answered Mar 22 '16 at 06:19

K & R, chapter and verse, p. 43 and 44:

There is one subtle point about the conversion of characters to integers. The language does not specify whether variables of type char are signed or unsigned quantities. When a char is converted to an int, can it ever produce a negative integer? The answer varies from machine to machine, reflecting differences in architecture. On some machines, a char whose leftmost bit is 1 will be converted to a negative integer ("sign extension"). On others, a char is promoted to an int by adding zeros at the left end, and thus is always positive. [...] Arbitrary bit patterns stored in character variables may appear to be negative on some machines, yet positive on others. For portability, specify signed or unsigned if non-character data is to be stored in char variables.

WiSaGaN · Answer 2 · 2016-03-22T06:25:18.160

Because unsigned char is used as a one-byte integer type in C89.

Note that there are three distinct char-related types in C89: char, signed char, and unsigned char.

For characters, char is used.

unsigned char and signed char are used for one-byte integers, much as short is typically used for two-byte integers. You should not really use signed char or unsigned char for characters. Neither should you rely on the order of those values.

score 2 · Answer 3 · answered Mar 22 '16 at 06:21

The bit representation of a number is what the computer stores, but it doesn't mean anything without someone (or something) imposing a pattern onto it.

The difference between the unsigned char and signed char patterns is how we interpret the set bits. In one case we decide that zero is the smallest number and we can add bits until we get to 0xFF or binary 11111111. In the other case we decide that 0x80 is the smallest number and we can add bits until we get to 0x7F.

The reason we have the funny way of representing signed numbers (the latter pattern) is that it places zero (0x00) roughly in the middle of the sequence, and that 0xFF (which is -1, right before zero) plus 0x01 (which is 1, right after zero) carries through every bit and off the high end, leaving 0x00 (-1 + 1 = 0). Likewise, -5 + 5 = 0 by the same mechanism.

For fun, there are a lot of bit patterns that mean different things. For example 0x2a might be what we call a "number" or it might be a * character. It depends on the context we choose to impose on the bit patterns.

The biggest and best reason is that one type makes the code easier to understand in a given context. For example, I could use `signed char` to store the number of children a family might have, but that would be very confusing and would permit negative children. Better to use an `unsigned char`, permitting a few more children (255 max) and making it completely clear that the number is not negative. Finally, your `printf` implicitly converts the value to a character; if you want the decimal representation, use `printf("%d", ...)`, which will show the expected difference in values. — Edwin Buck, Mar 22 '16 at 06:28

chux - Reinstate Monica · Accepted Answer · 2016-03-22T16:42:44.453

With printing characters - no difference:

With "%c", printf() takes the int argument, converts it to unsigned char, and prints the resulting character.

char a;
printf("%c\n",a);  // a is converted to int, then passed to printf()
unsigned char ua;
printf("%c\n",ua); // ua is converted to int, then passed to printf()

With printing values (numbers) - difference when system uses a char that is signed:

char a = -1;
printf("%d\n",a);     // --> -1
unsigned char ua = -1;
printf("%d\n",ua);    // --> 255  (Assume 8-bit unsigned char)

Note: On rare machines int is the same size as char, and other concerns apply.

So if code uses a as a number rather than a character, the printing differences are significant.

score 0 · Answer 5 · answered Mar 22 '16 at 16:59

Different types are created to tell the compiler how to "understand" the bit representation of one or more bytes. For example, say I have a byte which contains 0xFF. If it's interpreted as a signed char, it's -1; if it's interpreted as an unsigned char, it's 255.

In your case, a, whether signed or unsigned, undergoes integer promotion to int and is passed to printf(), which then implicitly converts it to unsigned char before printing it as a character.

But let's consider another case:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char a = -1;
    unsigned char b;
    memmove(&b, &a, 1);
    printf("%d %u", a, b);
}

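It is tempting to simply write printf("%d %u", a, a);, but passing a negative int to %u is technically undefined behaviour (with a 32-bit int it usually prints 4294967295 for the second value). memmove() is used to copy the byte unchanged into an unsigned char, so b holds the same bit pattern as a.

Its output on my machine (where char is signed and 8 bits wide) is:

-1 255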

Also, think about this ridiculous question:

Suppose sizeof (int) == 4. Since arrays of characters from (unsigned char[]){0, 0, 0, 0} to (unsigned char[]){UCHAR_MAX, UCHAR_MAX, UCHAR_MAX, UCHAR_MAX} are the same as unsigned ints from 0 to UINT_MAX, what is the point of using unsigned int?
