Bitwise & over 32 bits

Question

#include <stdio.h>
#include <limits.h>

int main()
{
    unsigned long long a = 9223372036854775808; // a = 2^63
    a = a & ~1;
    printf("%llu\n", a);
    printf("%d, %lld", INT_MAX, LLONG_MAX);
}

Output

9223372036854775808
2147483647, 9223372036854775807

These are the semantics of the ~ operator in C17 6.5.3.3, paragraph 4 (emphasis mine):

The result of the ~ operator is the bitwise complement of its (promoted) operand (that is, each bit in the result is set if and only if the corresponding bit in the converted operand is not set). The integer promotions are performed on the operand, and the result has the promoted type. If the promoted type is an unsigned type, the expression ~E is equivalent to the maximum value representable in that type minus E.

These are the semantics of the binary & operator in C17 6.5.10, paragraph 3:

The usual arithmetic conversions are performed on the operands.

a is equal to 9223372036854775808 which is equal to 8000 0000 0000 0000(16).

An integer constant 1 without any suffix is the same as (int)1. Therefore ~1 == ~(int)1 == FFFF FFFE(16) (the integer promotions do not change the operand of ~, because it is already an int).

The type of FFFF FFFE(16) is converted to unsigned long long at
a = 8000 0000 0000 0000(16) & FFFF FFFE(16) by the usual arithmetic conversion. Therefore
a = 8000 0000 0000 0000(16) & FFFF FFFE(16) ==
a = 8000 0000 0000 0000(16) & 0000 0000 FFFF FFFE(16)

Finally,
a = a & ~1 ==
a = 8000 0000 0000 0000(16) & FFFF FFFE(16) ==
a = 8000 0000 0000 0000(16) & 0000 0000 FFFF FFFE(16) ==
a = 0000 0000 0000 0000(16)

But the output looks as if the computation were
a = a & ~1 ==
a = 8000 0000 0000 0000(16) & FFFF FFFE(16) ==
a = 8000 0000 0000 0000(16) & FFFF FFFF FFFF FFFE(16) ==
a = 8000 0000 0000 0000(16)

My question is: why does the program produce this output?

You might want to use `uint64_t` instead of `unsigned long long int` not only because it's less typing, but because it makes it clear what it is. — tadman, Nov 10 '20 at 09:33
Hint: use `%x` rather than `%d` and use hexadecimal constants here (e.g. `0x8000`), it makes everything much simpler. I have no idea what `9223372036854775808` is but `0x8000000000000000` is clear — Jabberwocky, Nov 10 '20 at 09:35
For your code I get `main.c:6:28: warning: integer constant is so large that it is unsigned`, which I can avoid by using `9223372036854775808ull`. — Yunnosch, Nov 10 '20 at 09:35
Try `printf("%x, %llx", INT_MAX, ULLONG_MAX);`. Note the `U...` (the `x` is not so important). — Yunnosch, Nov 10 '20 at 09:40
When `0xFFFF FFFE` is promoted from 32 bits to 64, the value is `0xFFFF FFFF FFFF FFFE`. In other words, it's sign extended to 64 bits, and then converted to unsigned. Converting to unsigned does nothing, but sign extension sets all of the upper bits. Of course, you can avoid the whole issue by using `~1ULL` in place of `~1`. — user3386109, Nov 10 '20 at 09:43
@user3386109 But in C17 6.3.1.3 1, "... , if the value can be represented by the new type, it is unchanged". The value `0xFFFF FFFE` can be represented by the `unsigned long long`. So It should be converted to `0x 0000 0000 FFFF FFFE`. Can you give me that part of documentation? — op ol, Nov 10 '20 at 09:53
@opol You see hex digits. The **compiler** just sees a number, and that number is negative. So the conversion from 32 bits to 64 bits must maintain that negative value. Assuming the implementation uses 2's complement, that negative value is maintained by copying the sign bit into the upper 32 bits of the 64 bit value. This isn't crystal clear in the spec because the spec doesn't mandate 2's complement. It still allows 1's complement and signed-magnitude representations of integer values. — user3386109, Nov 10 '20 at 10:03
@user3386109 I overlooked that `0xFFFF FFFE` is negative in 2's complement system. This value is converted to `-2 + (2^64-1) + 1` and it is the same as `0xFFFF FFFF FFFF FFFE`. Thank you. — op ol, Nov 10 '20 at 10:09
"Therefore ~1 == ~(int)1 == FFFF FFFE(16)" is the beginning of the mistake. `~(int)1` may have the bit pattern FFFF FFFE(16), but its value did not change: still -2(16). — chux - Reinstate Monica, Nov 10 '20 at 11:31

0___________ · Accepted Answer · 2020-11-10T10:20:55.003

3

In a two's complement system (which all modern systems use, including your PC), ~1 is -2.

The binary representation of -2 in a 4-byte integer is 0xfffffffe.

When you promote it to long long integer the value -2 is preserved but the binary representation changes: 0xfffffffffffffffe. This value is binary AND-ed with your variable a, so its value remains unchanged.

If you want to prevent this behaviour, you need to tell the compiler what size of data you want to use:

    a = a & ~(unsigned)1;

and it will behave as you expect

https://godbolt.org/z/G6757W

edited Nov 10 '20 at 10:20

answered Nov 10 '20 at 10:15

0___________

"When you promote it to long long integer" --> Why discuss `long long` as `-2` is converted to `unsigned long long` as part of `a & ~1`? – chux - Reinstate Monica Nov 10 '20 at 11:28
Indeed. There is no `long long` anywhere and no conversion to `long long` takes place. I don't understand this answer. – Lundin Nov 10 '20 at 12:04

Ian Abbott · Answer 2 · 2020-11-10T13:09:33.573

I assume the normal signed integer types such as int have a 2's complement representation in the following answer, otherwise the numeric value of ~1 would be something other than -2.

The integer constant 1 has type int. The expression ~1 will have the value -2 and type int. Therefore, a = a & ~1; is equivalent to a = a & -2;.

Since a has type unsigned long long and ~1 has type int, then by the usual arithmetic conversions, the expression ~1 (numeric value -2) is converted to a value of type unsigned long long by mathematically adding (with infinite width) one more than ULLONG_MAX to the numeric value. Therefore a = a & ~1; is equivalent to a = a & -2; which is equivalent to a = a & (ULLONG_MAX - 1);

Since a has the value 0x8000000000000000ull (equivalent to 9223372036854775808ull) and the least significant 64 bits of ULLONG_MAX have the value 0xffffffffffffffffull, then the least significant 64 bits of (ULLONG_MAX - 1) have the value 0xfffffffffffffffeull. Since 0x8000000000000000ull & 0xfffffffffffffffeull is equal to 0x8000000000000000ull and any bits beyond the first 64 bits of a are all zero, the value of a will be unchanged by the assignment.

Yunnosch · Answer 3 · 2020-11-10T11:56:03.800

1

It took me some time to understand what the question actually was.
So I was too late to beat P__J to the actually enlightening technical answer.

Instead, I have changed my answer into a visualisation of the other answer.


#include <stdio.h>
#include <limits.h>

int main()
{
    unsigned long long a = 9223372036854775808ull; // a = 2^63
    printf("%llx\n", a);
    printf("%d\n", ~1);
    printf("%x\n", (unsigned) ~1);
    printf("%llx\n", (unsigned long long )~1);
    a = a & ~1;
    printf("%llx\n", a);
    printf("%x, %llx, %llx", INT_MAX, LLONG_MAX, ULLONG_MAX);
}

Which gets you an output of

8000000000000000
-2
fffffffe
fffffffffffffffe
8000000000000000
7fffffff, 7fffffffffffffff, ffffffffffffffff

In my opinion, nothing in this output is implausible.

I used some proposals from the comments, especially by Jabberwocky.

edited Nov 10 '20 at 11:56

answered Nov 10 '20 at 09:45

Yunnosch

You and other people gave me some suggestions for improvement, and I tried them all. The only difference is that the code becomes easier to read; the values and the result are the same as before. I want to know why the result is `0x8000000000000000`. I expected `0`, for the reason given in my question. I'm sorry my question was not clear. – op ol Nov 10 '20 at 09:59
@opol it is because `-2` is converted to an 8-byte integer, as in my answer. – 0___________ Nov 10 '20 at 10:17
Now, too late, I got the question and have found the answer. I take second place in visualising the accepted answer. – Yunnosch Nov 10 '20 at 10:26
`printf("%llx\n", ~1);` is simply undefined behavior and possibly gives an out of bounds access. – Lundin Nov 10 '20 at 11:40
@Lundin Thanks. Tricky little nasal demons. Hope I got it right now. And took the opportunity for another interesting step. – Yunnosch Nov 10 '20 at 11:56

chux - Reinstate Monica · Answer 4 · 2020-11-10T11:40:42.037

"Therefore ~1 == ~(int)1 == FFFF FFFE(16)" is the beginning of the mistake.

~(int)1 may have the bit pattern FFFF FFFE(16), but its value does not change: still -2(10) or -2(16). It does not have the value of 4,294,967,294(10) or FFFF,FFFE(16).

When -2(16) is converted to unsigned long long, ULLONG_MAX + 1 is added (*), giving 18,446,744,073,709,551,614(10) or FFFF,FFFF,FFFF,FFFE(16).

  8000 0000 0000 0000(16)
& FFFF FFFF FFFF FFFE(16)
= 8000 0000 0000 0000(16)

(*) When unsigned long long is 64-bit.

score 1 · Answer 5 · answered Nov 10 '20 at 12:30

Indeed no integer promotion takes place for the ~ operand since it is already int.
a & ~1; is equivalent to a & -2 (given 2's complement).
The operand a is of type unsigned long long, the operand -2 is of type int.
The conversion happens according to the usual arithmetic conversions (C17 6.3.1.8):

Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

Which means that the rule regarding signed to unsigned conversion applies (C17 6.3.1.3):

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

The above refers to the pure mathematical calculation without caring about types. The maximum value ULLONG_MAX is 2^64 - 1. Add "one more" according to the above rule:

-2 + (2^64 - 1) + 1 = 18446744073709551614 = 0xFFFFFFFFFFFFFFFE

Now as it happens, this is the very same value as -2 would get when sign extended to signed long long on 2's complement, but no such conversion to a larger signed type actually takes place anywhere.

0x8000000000000000 & 0xFFFFFFFFFFFFFFFE gives 0x8000000000000000.
