
I want to be sure I understand what happens when two ints of different widths are bitwise OR'ed with each other. The most sensible option would be to left-pad the smaller one with zeroes. I wrote a small program to test this.

Look at this sample code:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    uint32_t foo = 0x00000000;
    uint8_t bar = 0xFF;
    printf("%"PRIu32"\n", (foo | bar));
    printf("%"PRIu32"\n", (bar | foo));
}

If my guess was right, I should expect to get 255 twice. When I run this, I get

255
255

Is this expected, well-defined behavior that is safe to rely on? Is there a link explaining how bit manipulation behaves with different int widths?

DJMcMayhem
  • You don't use `int`, but `uint32_t`. This is an unsigned integer type, but not necessarily `unsigned int` and definitely not `(signed) int`. You also invoke undefined behaviour by passing the wrong variadic argument types to `printf`. Use the correct format specifiers for `uint32_t` (see `inttypes.h`)! – too honest for this site Feb 23 '16 at 19:48
  • And about the bitops: This should be found online by a very simple search or in every C book. – too honest for this site Feb 23 '16 at 19:51
  • @Olaf Thanks for pointing that out. I'm not very good with `printf`. I edited that in my code and got the same result, so I'll edit my post so that it doesn't distract from my real question. – DJMcMayhem Feb 23 '16 at 19:51
  • Please read again! You now use another wrong specifier. Read my comment again. – too honest for this site Feb 23 '16 at 19:52
  • @Olaf is that better? – DJMcMayhem Feb 23 '16 at 19:56
  • Yes, it is. But as you use hex constants, you might want the printout in hex, too ;-) Also, as you correctly used the specifier for `uint32_t`, I think you already know the answer to your question. – too honest for this site Feb 23 '16 at 20:00
  • Still, there's no guarantee that the result of operator | is `uint32_t`: it's the wider one of `uint32_t` and `unsigned int`. `PRIu32` reduces the chance of UB because there are more implementations with 16-bit int than 64-bit int, yet it still depends on "chance". – user3528438 Feb 23 '16 at 20:25
  • @user3528438 actually it is the wider one of `uint32_t` and `signed int`. – M.M Feb 23 '16 at 20:28
  • @M.M Yep, you got it. – user3528438 Feb 23 '16 at 20:31

2 Answers


As per C11 standard, chapter §6.5.12, Bitwise inclusive OR operator

Each of the operands shall have integer type.

and

The usual arithmetic conversions are performed on the operands.

So, the bitwise operation should be fine.

However, in the printf() call, %d expects an int argument and you're supplying an unsigned int value. That is undefined behavior.

You can use the PRIu32 macro from inttypes.h to print a uint32_t.

Sourav Ghosh
  • %d for unsigned int is undefined behaviour if the value is larger than INT_MAX. %ud for int is undefined behaviour if the value is negative. For values in the common range, a signed or unsigned format for signed or unsigned values is fine. – gnasher729 Feb 23 '16 at 19:59
  • @gnasher729 what is `%ud`? – Sourav Ghosh Feb 23 '16 at 20:01
  • @SouravGhosh A `%u` directive and then a character `d` of course. – fuz Feb 23 '16 at 20:02
  • @SouravGhosh: With a little help, OP now got the type specifiers right. Not sure if he edited after you showed them (too imo) prominently or found out himself (which would have been the better way). – too honest for this site Feb 23 '16 at 20:07
  • @Olaf If OP has _got_ it correct, that will be good. However, sometimes question edits are confusing. :) – Sourav Ghosh Feb 23 '16 at 20:10
  • @Olaf I'm afraid you're right, `inttypes.h` is still missing... :( – Sourav Ghosh Feb 23 '16 at 20:15
  • @user3528438 Depends which is wider, `uint32_t` or `unsigned`. `(foo | bar)` will be the wider type. – chux - Reinstate Monica Feb 23 '16 at 20:28
  • @chux actually it is `uint32_t` vs `signed` . However `PRIu32` should be fine either way because the resulting value must be in range for `uint32_t` even if `int` is chosen. – M.M Feb 23 '16 at 20:30
  • @M.M I don't know if the range matters in this case in terms of UB. I think a cast is necessary if we want to get away from UB. – user3528438 Feb 23 '16 at 20:34
  • @user3528438 the printf specification is not precise but my interpretation is that `%u` may be used with signed ints with non-negative value; `%hu` may be used with signed or unsigned `int` whose value is in the range of `unsigned short` , and so on. This is because of the default argument promotions; such a value is indistinguishable from a correct argument that's been promoted. – M.M Feb 23 '16 at 20:39

Your code is doing something different behind the scenes than you probably expect:

uint32_t foo = 0x00000000;
uint8_t bar = 0xFF;
printf("%"PRIu32"\n", (foo | bar));

Let's assume that on your system uint32_t is unsigned int. In that case, the compiler handles the expression (foo | bar) as follows:

  • First, bar is promoted to type int (a process known as integer promotion). This does not change its mathematical value of 255 at all.
  • Then, the resulting int is converted to unsigned int, because the other operand of | is of type unsigned int. Again, nothing happens to the mathematical value; it is still 255.
  • Finally, the result is, as you expect, 255, and it is of type unsigned int.

The relevant topics for you to take a look at are C's handling of integer promotion and implicit conversion rules.

Dirk Herrmann
  • There are no "integral promotions", but _integer promotions_. And there is no coercion to `int` (unless `int` has more than 32 bits, of course, but then there is no conversion to `unsigned int` in the second place). – too honest for this site Feb 23 '16 at 20:02
  • @Olaf: Often enough I am wrong, but here you lost me. – Dirk Herrmann Feb 23 '16 at 20:09
  • Can you please clarify what you mean? I just follow the language standard. This is a good idea in general, but vital for "language-lawyer" questions. (Although this one is very clear). – too honest for this site Feb 23 '16 at 20:10
  • @Olaf: bar is of type uint8_t, and thus is definitely promoted to int. And, the usual arithmetic conversions apply, meaning that the signed int will be converted to unsigned int. Regarding integral promotion: http://stackoverflow.com/questions/10660758/integral-promotion. – Dirk Herrmann Feb 23 '16 at 20:11
  • For the two-step conversion: You are basically right (had to dig a bit in the standard to follow that trail), but as we have an `unsigned char`, there is no difference to a single conversion step (and in reality there is also none for other coercions). Anyway, the standard does not use the term "integral promotion", but "integer promotions" (6.3.1.1p2). And we should use this term, not one from a third-hand source which just adds confusion. – too honest for this site Feb 23 '16 at 20:24
  • @Olaf: You are right about the standard not using the term integral promotion - will fix that. – Dirk Herrmann Feb 23 '16 at 20:27
  • There _are_ "integral promotions". In C89 ;-) – chux - Reinstate Monica Feb 23 '16 at 20:31
  • This answer is correct, but there are more possible cases than just `uint32_t` being `unsigned int`; it would be good to cover all cases. – M.M Feb 23 '16 at 20:32
  • @Olaf It is correct that `bar` is promoted to `int` and then to `uint32_t` (in the case of int being 32-bit). See C11 6.3.1.8/1 "Otherwise, the integer promotions are performed on both operands. *Then* the following rules are applied to the promoted operands: [...] the operand with signed integer type is converted to the type of the operand with unsigned integer type." – M.M Feb 23 '16 at 20:35
  • @M.M: I thought my last comment made clear I accepted this. Sorry if I didn't state that clearly enough. – too honest for this site Feb 23 '16 at 20:48