11

In the following program I used the unsigned keyword.

#include <stdio.h>

int main()
{
        unsigned char i = 'A';
        unsigned j = 'B';
        printf(" i = %c j = %c", i, j);
}

Output:

 i = A j = B

Is unsigned char i equivalent to unsigned j?

Sourav Ghosh
msc
  • Possible duplicate of [what is the unsigned datatype?](http://stackoverflow.com/questions/1171839/what-is-the-unsigned-datatype) – Cody Gray - on strike Sep 06 '16 at 07:43
  • @CodyGray That's just the half of this question and doesn't address the printf argument promotion, which is the real reason why this code works. – Lundin Sep 06 '16 at 11:58
  • Hmm well, the question is pretty explicit in the title and the body about what is being asked, and only 1/3 of the answers thought it was important to discuss argument promotion. You might be biased, having posted one such answer yourself. :-) – Cody Gray - on strike Sep 06 '16 at 13:27

6 Answers

24

Is unsigned char i equivalent to unsigned j?

No. When the int keyword is omitted but any of the signed, unsigned, short or long specifiers is present, int is implied.

This means that the following declarations are equivalent:

long a;
long int a;

unsigned short b;
unsigned short int b;

The issue with the signedness of char is that when signed or unsigned is omitted, the signedness is up to the implementation:

char x; // signedness depends on the implementation
unsigned char y; // definitely unsigned
signed char z; // definitely signed
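
A quick way to see which choice your implementation makes for plain char (a minimal sketch; the result depends on the compiler and on flags such as GCC's -fsigned-char/-funsigned-char):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* CHAR_MIN is 0 when plain char is unsigned, negative when it is signed */
    if (CHAR_MIN == 0)
        printf("plain char is unsigned here\n");
    else
        printf("plain char is signed here (CHAR_MIN = %d)\n", CHAR_MIN);
}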
Blagovest Buyukliev
  • And the type specifier should never be omitted. This is programming; we'd better be specific about what we want our code to do. – Tim Sep 06 '16 at 05:58
  • `signed`, `unsigned`, `short`, `long`, and `int` are *all* type specifiers. See [N1570](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) section 6.7.2 for an explanation of what various combinations of type specifiers mean. @TimF, `unsigned` is just as specific as `unsigned int` (though it doesn't hurt to write `unsigned int` if you find it clearer). – Keith Thompson Sep 06 '16 at 06:01
  • @TimF: I disagree. Plain `unsigned` is not ambiguous or anything - it's `unsigned int`, and every C programmer should know it, it's not some esoteric dark language corner. Most importantly, `unsigned` is long enough to type by itself, even without adding the redundancy of `int`. – Matteo Italia Sep 06 '16 at 06:04
  • @MatteoItalia Well maybe it's clear in the C standard anyway it's not obvious while reading and this question clearly shows that. You can of course state that "every C programmer should know that", but you prevent yourself from the case someone doesn't know that if you just specify the type. – Tim Sep 06 '16 at 06:11
  • `unsigned` *does* specify the type. – Keith Thompson Sep 06 '16 at 06:29
  • @KeithThompson: Since the standard treats both `int` and `signed` as type specifiers, we can reason only about combinations of them. Isn't it rational to assume that *some* type specifiers affect other type specifiers and maybe we could call the former "type modifiers" (though it is not strictly standard-conforming)? – Blagovest Buyukliev Sep 06 '16 at 06:35
  • @BlagovestBuyukliev: IMHO it's more reasonable to follow the terminology used by the standard. – Keith Thompson Sep 06 '16 at 06:37
  • @TimF: OP is clearly learning the language; again, there are some corners of the language which it is definitely excusable not to know, since they are extremely rare to see "in real life" (bitfields come to mind, or even just the `signed` keyword) or both rare and with absurd rules (the exact syntax for complex type declarations comes to mind, especially when function pointers are involved); but I'd argue that this isn't one of those cases - `unsigned` alone is both widely used and an important example of how simple type declarations work. – Matteo Italia Sep 06 '16 at 06:47
  • @MatteoItalia Actually I had never seen this notation before. I usually see three types of notation: plain notation `unsigned int`, stdint.h notation `uintx_t`, or custom defined notation `UINTX`. I have been writing C for 5-6 years now; maybe I'm not the best but I'm sure I'm not the worst either. If I didn't find this clear, I assume I won't be the only one. Is there a bonus at the end of the year for devs who write the fewest characters in their source files? – Tim Sep 06 '16 at 06:57
  • @TimF: there's a bonus for developers who can look this stuff up the first time they encounter it, and then remember it, instead of complaining that other people should never write anything you don't already know ;-p – Steve Jessop Sep 06 '16 at 07:52
  • @SteveJessop Sure, that's also how I work. Still, if there's a more explicit way to write a type, why not do it? IMHO the score of this question clearly shows that for some people it's ambiguous to only declare an `unsigned` variable. Whether people can look for an answer or not doesn't change the fact that more people will understand if we give the full type specifier. Anyway, don't worry about me, I'll remember it, but I'll probably never ever write `unsigned foo` in my code. – Tim Sep 06 '16 at 08:35
  • @TimF: "for some people it's ambiguous" well, some people don't know what it means, I don't think that's the same as it being ambiguous. As for why write it at all, for the same reason you write `int` when `signed int` would be more explicit: because you believe your audience will understand it and appreciate the more concise choice. If you're right you win, if you're wrong you lose (or the reader loses), if you think your audience doesn't know it then don't use it. Just like any idiom. – Steve Jessop Sep 06 '16 at 08:38
  • I think this is in essence the same argument that comes up when you write `1 + 2 << 3`. Is the reader expected to know C operator precedence, or should I add parentheses to that? What about `1 + 1 * 2`? What about `a + b ? c + d : e + f`? And so on until you've done enough code review and want to get on with something more exciting ;-) It just depends what you expect of your colleagues and other readers of your code, and is somewhat circular since if you use it in your code base then your habitual readers will eventually learn it. – Steve Jessop Sep 06 '16 at 08:44
  • This is just nonsense: "when the type specifier is omitted and any of the signed, unsigned, short or long specifiers are used, int is assumed." As Keith Thompson points out, these are _all_ type specifiers. So this answer doesn't make any sense and I understand nothing of it, since it is basically saying "when the cookie is omitted and any of these cookies [list of cookies] are used, chocolate cookie is assumed". Eh? – Lundin Sep 06 '16 at 11:39
  • @Lundin: If we deviate a bit from standard-centric fetishism and try to formulate a rule of thumb for the average non-compiler-writing programmer, we can redefine a "type specifier" so that it excludes the aforementioned 4 keywords and treat the latter simply as "specifiers" or even, God forbid, "type modifiers" :-) The alternative is to interpret each combination in isolation which is a lot more to remember. – Blagovest Buyukliev Sep 06 '16 at 12:06
  • I don't know if it can help but there's a Documentation topic about it: http://stackoverflow.com/documentation/c/309/data-types/1083/integer-types-and-constants#t=201609061236256617884 – Giacomo Garabello Sep 06 '16 at 12:40
  • @BlagovestBuyukliev Even if you re-invent the C term to some personal, non-standard definition, I still don't see how `short` or `long` would end up as `int`. – Lundin Sep 06 '16 at 12:40
  • Both `short` and `long` are still *integers*, but of a possibly different size than `int` alone. It can be seen as a modification of the default size and signedness of a type. – Blagovest Buyukliev Sep 06 '16 at 13:16
  • @TimF How is `unsigned` vs `unsigned int` different from `long` vs `long int`? As far as I know the fact that people write `long` but (typically) don't write `unsigned` is an arbitrary cultural thing. – Random832 Sep 06 '16 at 14:15
  • @Lundin They end up as `signed short int` and `signed long int`, which are the full names of the types in question. `short` and `long` alone are abbreviations in the same way as `unsigned` here. – Random832 Sep 06 '16 at 14:17
  • @BlagovestBuyukliev: So you want to ignore what the standard says about type specifiers and invent a new category that you call "type modifiers" -- and you think this makes things simpler for new programmers to understand? – Keith Thompson Sep 06 '16 at 14:44
  • @KeithThompson: Yes, it is easier to remember this way instead of remembering the combinations that the standard talks about. – Blagovest Buyukliev Sep 06 '16 at 16:14
  • @Random832 You've got a point, it's an arbitrary cultural thing, and from my arbitrary cultural point of view I find `unsigned int` and `long int` clearer, but my overall favorite goes to the `stdint.h` notation. It's not only that `signed long long int` or `long long` are unclear about what kind of data you're dealing with, it's also unclear (even though it's written in the spec) that these are the same. – Tim Sep 07 '16 at 05:51
10

unsigned, int and char are so-called type specifiers. The rules in C about type specifiers are mighty weird and irrational, sometimes for backwards-compatibility reasons.

Chapter 6.7.2/2 of the C standard goes like this:

  • signed char means signed char.
  • unsigned char means unsigned char.
  • char is a distinct type apart from the two above and can be either signed or unsigned. Depends on compiler.
  • short, signed short, short int, or signed short int means short.
  • unsigned short, or unsigned short int means unsigned short.
  • int, signed, or signed int means int.
  • unsigned, or unsigned int means unsigned int.
  • long, signed long, long int, or signed long int means long.

And so on. If you think that this system doesn't make much sense, it is because it doesn't.

Given the above, we can tell that unsigned char and unsigned are different types.

There is however a special-case rule (called "the default argument promotions") for all variable-argument functions like printf, saying that all small integer types such as char are implicitly promoted to type int. That's why you can use printf("%d", ...) on characters as well as integers. Similarly, %c works for integers, because it is actually impossible for printf to receive a real, non-promoted character type through its parameters. Anything of type int or smaller will end up as int on printf's side.
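
A small illustration of those promotions (a hedged sketch; the printed number assumes an ASCII system):

#include <stdio.h>

int main(void)
{
    char c = 'A';      /* a small integer type */
    unsigned u = 'B';  /* already int-sized */

    /* c is promoted to int before printf ever sees it, so %d is fine */
    printf("%d\n", c);   /* prints 65 on ASCII systems */

    /* %c converts the promoted int back to unsigned char for printing */
    printf("%c\n", c);   /* prints A */
    printf("%c\n", u);   /* prints B */
}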

Lundin
  • `char` is a separate type to both `signed char` and `unsigned char` – M.M Sep 06 '16 at 06:31
  • @M.M Yes, but it is not important here. It is enough to know that it is either signed or unsigned. – Lundin Sep 06 '16 at 06:33
  • It's important that `char` does not mean either of `signed char` or `unsigned char`. They are three separate types. Your third point is at odds with the rest of your post; for example `signed short` and `short` are actually two different ways of writing the same type, but you wrote "means" to express both cases. – M.M Sep 06 '16 at 06:35
  • @M.M The difference is only important to people writing compilers and language-lawyer fetishists. I doubt that those are the main audience here. Anyway, I've edited the post to sate those too. – Lundin Sep 06 '16 at 06:42
  • And actual coders, for example `unsigned char s[] = "hello"; strchr(s, 'e');` is a constraint violation even on systems where `char` is unsigned. It is somewhat common in real world code for `char` and `unsigned char` to be mixed like this. – M.M Sep 06 '16 at 06:49
  • More answers here should have mentioned the var-args promotion to int. – StoryTeller - Unslander Monica Sep 06 '16 at 06:52
  • @M.M Constraint violation of what rule? The "strict aliasing" rule only mentions "a character type", which is true for unsigned char and char both. – Lundin Sep 06 '16 at 06:54
  • _Depends on compiler_: actually on compiler and its settings: e.g. GCC has `-fsigned-char` and `-funsigned-char` options to set signedness of `char`. – Ruslan Sep 06 '16 at 07:30
  • "The difference is only important to people writing compilers and language-lawyer fetishists" -- well, it's important to someone who wants to know why it is that when a function takes `char*` they can't pass it either of `signed char*` or `unsigned char*`. Both fail with a compiler error about the wrong parameter type as in M.M's example. Whereas when a function takes `int*` they can pass it `signed int*`. So, compiler-writers, language fetishists, and the mildly curious. – Steve Jessop Sep 06 '16 at 07:56
  • @Lundin `char` and `unsigned char` are incompatible types, so `unsigned char *` cannot be passed to a function taking `char *` parameter – M.M Sep 06 '16 at 09:13
  • @M.M It is still not clear to me why a char, which is unsigned on a given implementation, would not be compatible with `unsigned char`. The standard's definition of compatible type is completely worthless: "Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers". But 6.7.2 doesn't even mention the word "compatible" anywhere. Seems all that's left after reading that is personal interpretation and opinions. – Lundin Sep 06 '16 at 11:27
  • @Lundin see C11 6.7.2.2/4 for mention of compatibility in that section. I don't see any room for personal interpretation and the definition of "compatible type" seems clear and useful to me. It sounds like you don't understand the definition. "compatible type" is why the code `float f; strcpy(&f, "x");` is invalid: `float` and `char` are incompatible. – M.M Sep 06 '16 at 11:35
  • @M.M Eh? 6.7.2.2/4 is about enums only. The only thing that seems to define the different types is the table I referred to in this answer. But nowhere in the initial text does it mention that each row in that list stands for a distinct type, nor does it mention compatible type. – Lundin Sep 06 '16 at 11:56
  • For completeness, it might also be worth mentioning that variables with _no_ type specifiers (in versions of C where that is allowed), e.g. static/extern/auto x, are [signed] int. (Also, technically, the rule allowing you to pass a small-valued unsigned int to a printf specifier that expects an int, or vice versa, is somewhat obscure) – Random832 Sep 06 '16 at 14:43
  • @M.M Ultimately, the fact that `char` and `signed/unsigned char` (of whichever is in fact the same signedness) are incompatible is "arbitrary", as with the fact that `long` is incompatible with whichever (if any) of `int` and `long long` is in fact the same size (or `short` and `int` on platforms where _those_ are in fact the same size). Float and char have different size, alignment, and representation, so it's not _really_ the same thing. – Random832 Sep 06 '16 at 14:49
  • @Random832 "Implicit int" has thankfully been removed from the language long time ago. I see no reason to mention that C used to be even more dysfunctional... the above type mess is bad enough. As for types, the integer promotion rule will always turn small integer types to (signed) `int`, no matter their original signedness.Yet another language flaw. – Lundin Sep 06 '16 at 15:10
  • @Lundin I'm talking about the fact that his `unsigned` [int] `j` works fine in `%c`. IIRC there's a specific rule to allow that case, otherwise it would be undefined behavior since unsigned int isn't the same type as int. – Random832 Sep 06 '16 at 15:30
  • @Lundin 6.7.2.2/4 talks about when a `enum` type is *compatible* with another type, that is what is being referred to by "Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers". – M.M Sep 06 '16 at 21:56
  • @Random832 you could say that any language rule is arbitrary (as evinced by the fact that for any rule you can find some other language that doesn't have the same rule). – M.M Sep 06 '16 at 21:57
5

No, they're not equivalent. unsigned char and unsigned are two different types (unsigned is equivalent to unsigned int). They're both unsigned integer types, and they can hold some of the same values.

The %c format specifier for printf expects an argument of type int (which is a signed integer type). An unsigned char argument is promoted to int, so that's ok. An unsigned int argument is not converted to int, but it happens to work in this case (for reasons I won't go into).

Note that character constants like 'A' are also of type int.
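
To see that 'A' really has type int in C, one can compare sizes (a minimal check; in C++, by contrast, sizeof 'A' would be 1):

#include <stdio.h>

int main(void)
{
    /* character constants are ints in C, so the two sizes match */
    printf("sizeof 'A'  = %zu\n", sizeof 'A');
    printf("sizeof(int) = %zu\n", sizeof(int));
}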

Keith Thompson
  • No, the `%c` specifier expects type `char`. Though as it happens, variable argument list functions promote all small integer types to `int`. That's `printf`'s problem though - it will have to convert that `int` back to a `char` internally, in some implementation-defined way. But that's not something the programmer should need to concern themselves with. – Lundin Sep 06 '16 at 06:13
  • @Lundin: No, `%c` expects an argument of type `int`, and the value is converted to `unsigned char`. N1570 7.21.6.1p8. – Keith Thompson Sep 06 '16 at 06:26
3

Nope, unsigned j is the same as unsigned int j.

According to C11, chapter §6.7.2, Type specifiers

unsigned, or unsigned int

So it is evident that omitting the int still leaves the variable an int.

That said, to clear up any confusion, the spec lists

  • unsigned char

...

  • unsigned, or unsigned int

as two different type specifiers, so evidently they are not the same.
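
One way to watch the compiler draw that distinction is C11's _Generic (a sketch assuming a C11 compiler; the TYPE_NAME macro is just for illustration):

#include <stdio.h>

#define TYPE_NAME(x) _Generic((x),          \
        unsigned char: "unsigned char",     \
        unsigned int:  "unsigned int",      \
        default:       "something else")

int main(void)
{
    unsigned char i = 'A';
    unsigned j = 'B';                    /* same as unsigned int j */
    printf("i is %s\n", TYPE_NAME(i));   /* prints: i is unsigned char */
    printf("j is %s\n", TYPE_NAME(j));   /* prints: j is unsigned int */
}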

Sourav Ghosh
2

The answer is No.

Whenever you use any of the specifiers signed, unsigned, long or short without a base type, the type is taken as int by default.

In the case of characters:

char a; // Depends on the implementation.
signed char b; // Signed
unsigned char c; // Unsigned
Shravan40
  • All those keywords *do* specify the type. They're all *type specifiers*. The standard specifies what various combinations of them mean. – Keith Thompson Sep 06 '16 at 06:04
  • @Lundin : You were right. Now i have edited my answer. – Shravan40 Sep 06 '16 at 06:22
  • This has nothing to do with if you're assigning a number; C is not dynamic and does not have type inference. It is true that `unsigned x` is equivalent to `unsigned int x`, `long` to `long int`, etc, but saying these are "taken as `int`" is misleading. @Lundin explaining why an answer is wrong is better than simply flatly saying "it's just plain wrong". – Random832 Sep 06 '16 at 14:23
  • @Random832 Usually, yeah. But since this answer seems to be nothing but shameless copy-pasta of another (confusing) answer, they can find the explanation in the comments of that other answer too. – Lundin Sep 06 '16 at 14:40
0

Depends a bit on the compiler and architecture, but in general an unsigned int is 4 or 8 bytes in length, while an unsigned char is 1 byte, which is usually an octet.

That means an unsigned takes 4-to-8x the space and can hold a much larger number (an unsigned can, for example, hold the number 1,024 while a typical unsigned char can hold only up to 255). Depending on the endianness of your processor, the first octet of the unsigned might be the same as the unsigned char version (or not).

As long as you don't care about memory use or overflow, and aren't trying to do weird things with pointers, they behave much the same. But they start differing quickly as you start doing funkier things in C.
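
A quick way to check the sizes and ranges on your own platform (a minimal sketch; the printed values are implementation-specific, except that sizeof(unsigned char) is 1 by definition):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("sizeof(unsigned char) = %zu, max = %u\n",
           sizeof(unsigned char), (unsigned)UCHAR_MAX);
    printf("sizeof(unsigned int)  = %zu, max = %u\n",
           sizeof(unsigned), UINT_MAX);
    printf("bits per byte (CHAR_BIT) = %d\n", CHAR_BIT);
}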

BJ Black
  • An `unsigned char` is *always* 1 byte in length, by definition. (The number of bits in a byte can vary; it's always at least 8, and it's specified by `CHAR_BIT`.) – Keith Thompson Sep 06 '16 at 06:03
  • In any sane platform, yes. Various attempts to use UCS-2 as a char type have made that statement only 98% true. The spec only requires that a char be AT LEAST one byte. – BJ Black Sep 06 '16 at 06:07
  • No, for any conforming C implementation. That's how C defines the word "byte". A conforming implementation could make type `char` 16 bits, but `sizeof (char)` and `sizeof (unsigned char)` are still 1 by definition (and `CHAR_BIT` would be 16). "Byte" does not mean "octet". – Keith Thompson Sep 06 '16 at 06:08
  • "Conforming" is a big assumption. Part of the fun in C is that not every platform is sane. I've definitely seen bits/*.h that don't have that assumption (though admittedly not on a sane platform). – BJ Black Sep 06 '16 at 06:11
  • Though I'll admit the octet/byte differences can be subtle and weird. – BJ Black Sep 06 '16 at 06:12
  • Not in this case, I think. Can you cite *any* C implementation, conforming or not, in which `sizeof (char) != 1`? – Keith Thompson Sep 06 '16 at 06:13
  • Not without digging through some very old floppy disks in my shed. I'll fully admit that I've not seen a nonconforming implementation this century. – BJ Black Sep 06 '16 at 06:14
  • The chapter about the `sizeof` operator in the standard has a special paragraph regarding this, 6.5.3.4: `"When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1."` – Lundin Sep 06 '16 at 06:47
  • Interesting. Hadn't seen that before; learn something new everyday :-) – BJ Black Sep 06 '16 at 06:48
  • Answer modified for better specificity on byte-vs-octet. Thanks for the awesome discussion, all, and I'll endeavor to forget that mid-90s off-brand terminal SDK I played with a while back. – BJ Black Sep 06 '16 at 06:52
  • The most likely place you'll see chars that are not 8 bits is DSP chips, where char might be 24 bits. I've seen C implementations (probably non-conforming) on these where all of the C data types are basically 24 bit. – PeterI Sep 06 '16 at 10:25
  • @PeterI: There's nothing non-conforming about a C implementation with 24 bit chars (or if you can think of a reason why a conforming non-hosted C implementation can't be written - efficiently - for such a platform, I am sure the committee would regard that as a defect). Hosted is harder because with a 24-bit char, int would probably be 24-bit too, which makes it harder to distinguish EOF and a valid result from `getc`. – Martin Bonner supports Monica Sep 06 '16 at 11:34
  • I should have said "possibly" not "probably" (doh). Wikipedia's Motorola 56K entry has a link to their 56K gcc-based compiler manual, which does go into details of working on a system where sizeof(char)=sizeof(short)=sizeof(int). Since it was gcc-based I'd expect it to be ANSI-conforming. – PeterI Sep 06 '16 at 15:59