As Barmar guessed, clang compiles them both the same way across multiple architectures, doing whatever its cost model says is best on that architecture. Godbolt shows clang using `and edi, -128` on x86-64, where that's a 3-byte instruction since the constant fits in a sign-extended imm8. `shr edi, 7` is also 3 bytes, but can't run on as many execution ports on typical modern x86 CPUs.
But on AArch64, clang compiles both to `cmp w8, w0, lsr #7` against a constant loaded into `w8` (then `cset` to materialize a boolean; I did that instead of making the compiler branch).

GCC, on the other hand, makes different asm for the two versions, basically doing the source operations as written. That leads to worse asm for the `&` version on AArch64, where it does
    # GCC12.2 -O3 for AArch64 for the & version
    and     w0, w0, -128
    sub     w0, w0, #1769472
    subs    w0, w0, #128
    cset    w0, eq
    ret

instead of

    # GCC12.2 -O3 for AArch64 for the >> version
    # also clang for both versions does this
    mov     w1, 13825
    cmp     w1, w0, lsr 7    // AArch64 can use shifted source operands for some insns
    cset    w0, eq
    ret
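For reference, here is a minimal C sketch of the two source forms being compared. The function names are made up for illustration and the exact source in the question may differ; only the constants come from the text above. The forms are interchangeable because the low 7 bits of `0x001B0080U` are all zero, and `0x001B0080U >> 7` is `0x3601` (13825, the value in the `mov w1, 13825` above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only the constants are taken from the discussion. */

/* Mask form: clear the low 7 bits, compare against the full-width constant. */
bool match_mask(uint32_t x) { return (x & ~0x7Fu) == 0x001B0080u; }

/* Shift form: drop the low 7 bits, compare against the shifted constant. */
bool match_shift(uint32_t x) { return (x >> 7) == (0x001B0080u >> 7); }

/* Spot-check that both forms agree at the edges of the matching range
   and at a few values that differ only in the high bits. */
void check_forms_agree(void) {
    const uint32_t samples[] = {
        0u, 0x7Fu, 0x001B007Fu, 0x001B0080u,
        0x001B00FFu, 0x001B0100u, 0x801B0080u, 0xFFFFFFFFu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(match_mask(samples[i]) == match_shift(samples[i]));
}
```

Calling `check_forms_agree()` from a `main` is enough to exercise it. A compiler that canonicalizes both forms to the same internal representation (as clang apparently does) is then free to pick whichever is cheaper per target.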
With GCC for RISC-V, neither one seems worse with these constants, since neither `0x001B0080U` nor `0x001B0080U>>7` fits in 12 bits. The AND constant, `~0x7Fu` aka `-0x80u`, does fit in a sign-extended immediate for `andi`.
RISC-V always sign-extends immediates, unlike MIPS, which zero-extends immediates for bitwise booleans like `andi`.
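To make the immediate-range argument concrete, here is a rough sketch of the two checks, assuming the usual encodings: a 12-bit sign-extended immediate for RISC-V I-type instructions and a 16-bit zero-extended immediate for MIPS `andi` / `ori` / `xori`. The helper names are hypothetical, not anything from a compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* RISC-V I-type: 12-bit sign-extended immediate, i.e. [-2048, 2047].
   Viewed as a 32-bit pattern, that is 0..0x7FF or 0xFFFFF800..0xFFFFFFFF. */
bool fits_riscv_simm12(uint32_t c) {
    return c <= 0x7FFu || c >= 0xFFFFF800u;
}

/* MIPS bitwise-immediate instructions: 16-bit zero-extended, i.e. [0, 0xFFFF]. */
bool fits_mips_uimm16(uint32_t c) {
    return c <= 0xFFFFu;
}

/* Check the constants mentioned in the text (assumed helper, for illustration). */
void check_constants_from_text(void) {
    assert(fits_riscv_simm12(~0x7Fu));        /* AND mask works with RISC-V andi */
    assert(!fits_riscv_simm12(0x001B0080u));  /* compare constant, mask form     */
    assert(!fits_riscv_simm12(0x3601u));      /* compare constant, shift form    */
    assert(!fits_mips_uimm16(~0x7Fu));        /* the mask is not a MIPS andi imm */
    assert(fits_mips_uimm16(0x3601u));        /* but 0x3601 fits MIPS xori       */
}
```

The same mask constant is cheap on one ISA and not the other purely because of the extension rule, which is why the better source form can differ per target when the compiler doesn't canonicalize.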
You might see a difference on MIPS, where your `0x3601` can fit in a 16-bit immediate but larger values can't. Indeed, the 2nd one is great with MIPS GCC (`srl` / `xori` to get a 0 or non-zero value, and `sltu` to booleanize), but clang makes them both bad. (Godbolt).
Fortunately MIPS is basically obsolete, so I checked whether clang would make the same mistake on RV32, using a constant whose shifted form fits in 12 bits. (Also in the last Godbolt link). Clang makes good asm for RV32, using a shift for both, while GCC for RV32 actually does the AND and has to materialize `0x123<<7` with 2 instructions.
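As a rough illustration of why `0x123<<7` costs two instructions on RV32: it doesn't fit in a 12-bit immediate, so the usual approach is an upper-20-bits part plus a sign-extended low-12-bits part (typically `lui` + `addi`; I haven't checked that this is the exact pair GCC emitted). The struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two pieces of a lui + addi style constant. */
typedef struct {
    uint32_t upper20;  /* value for the upper 20 bits (lui-style immediate) */
    int32_t  lower12;  /* sign-extended 12-bit addend (addi-style immediate) */
} rv32_const_split;

rv32_const_split split_upper_lower(uint32_t c) {
    rv32_const_split s;
    /* Round up when bit 11 is set, because the low part will be negative. */
    s.upper20 = ((c + 0x800u) >> 12) & 0xFFFFFu;
    s.lower12 = (int32_t)(c & 0xFFFu);
    if (s.lower12 >= 0x800)
        s.lower12 -= 0x1000;
    /* Round trip: (upper << 12) + lower must rebuild c (mod 2^32). */
    assert((s.upper20 << 12) + (uint32_t)s.lower12 == c);
    return s;
}
```

For `0x123 << 7` (`0x9180`) this gives an upper part of 9 and a lower part of 384, while the shifted compare constant `0x123` needs no upper part at all, which is where the shift form wins.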
BTW, even with the smaller immediates, clang for MIPS is so bad it misses the fact that it could have used `xori` instead of `ori $2, $zero, 0x9180` / `xor reg,reg,reg`; maybe someone forgot to teach clang that `xori` zero-extends its immediate? It does know it can't use `xori` for `x ^ (0xffffffff<<7)`.