
I am writing an LLVM pass module to instrument every single memory operation in a program, and part of my logic needs to do some very hot binary logic on pointers.

How can I implement "bit ? u64_value : zero" in as few cycles as possible, preferably without using an explicit branch? I have a bit in the least significant bit of a register, and a value (assume u64) in another. If the bit is set, I want the value preserved. If the bit is zero, I want to zero out the register.

I can use x86 BMI instructions.

Peter Cordes
Andy
    Is it possible to represent the bit differently? For example, how about having a register contain `-1` if the bit is set, 0 otherwise? If you could do so, you could use a single `and` instruction to apply the mask. – fuz Dec 21 '18 at 20:58
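If the bit cannot be stored as a mask up front, the mask suggested in the comment above can be derived from it by negation. The following is a minimal sketch, not code from the question; the function name `keep_if_set_mask` is hypothetical, and it assumes the flag lives in bit 0 of a 64-bit value.

```cpp
#include <cstdint>

// Hypothetical helper: returns `value` if bit 0 of `flag` is set, else 0.
constexpr std::uint64_t keep_if_set_mask(std::uint64_t value, std::uint64_t flag) {
    // (flag & 1) is 0 or 1. Subtracting it from 0 wraps around (unsigned
    // arithmetic) to either 0 or all-ones, which is then used as an AND mask.
    return value & (std::uint64_t{0} - (flag & 1));
}
```

This costs an `and`, a `neg`, and another `and` on the dependency chain, so it is only attractive when the all-ones/zero mask can be computed once and reused.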

2 Answers


On AMD, and on Intel Broadwell and later, CMOV is only 1 uop, with 1 cycle of latency. On Intel Haswell and earlier it is 2 uops with 2 cycles of latency. It's your best bet for conditionally zeroing a register.

xor  r10d, r10d   # r10=0.  hoist out of loops if possible

test    al, 1           # test the low bit of RAX, setting ZF
cmovz   rax, r10        # zero RAX if the low bit was zero, otherwise unmodified

(A test r64, imm8 encoding doesn't exist, so use the low-8 register, as with al here, when the mask you're testing is all zero outside the low 8 bits. That avoids the longer imm32 form.)

If the bit position is in a register, bt reg, reg is only 1 uop on Intel and AMD. (bts reg,reg is 2 uops on AMD K8 through Ryzen, but plain bt, which just sets CF according to the value of the selected bit, is cheap on both AMD and Intel.)

bt     rax, rdx      # CF = (RAX >> (RDX & 63)) & 1, i.e. the selected bit of RAX
cmovnc rax, r10

With both of these, the register you test can be different from the CMOV destination.
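For reference, here is a source-level sketch of the same two operations, with the tested value kept separate from the value being zeroed. The function names are hypothetical, and nothing guarantees a particular instruction selection: optimizing compilers targeting x86-64 commonly turn ternaries like these into test/cmov or bt/cmov, but it is worth checking the generated assembly.

```cpp
#include <cstdint>

// Hypothetical helper: keep `value` if bit 0 of `flags` is set, else return 0.
// Corresponds to the test / cmovz sequence.
constexpr std::uint64_t keep_if_low_bit(std::uint64_t value, std::uint64_t flags) {
    return (flags & 1) ? value : 0;
}

// Hypothetical helper: keep `value` if bit `pos` of `flags` is set, else return 0.
// Corresponds to the bt / cmovnc sequence. The position is masked to 0..63,
// matching how bt treats a register bit offset with a 64-bit register operand.
constexpr std::uint64_t keep_if_bit(std::uint64_t value, std::uint64_t flags, unsigned pos) {
    return ((flags >> (pos & 63)) & 1) ? value : 0;
}
```

Masking `pos` also keeps the shift well defined in C++, where shifting a 64-bit value by 64 or more is undefined behaviour.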

See https://agner.org/optimize/ for more performance info, and also https://stackoverflow.com/tags/x86/info

Peter Cordes

select is your friend. It mostly compiles to cmov, and the backends will take care of it even where it doesn't. Semantically it's "if arg1 is true then arg2 else arg3", rather like ?: in C/C++/Java. In the C++ API you call SelectInst::Create(yourBool, yourInputValue, ConstantInt::get(i64, 0), instructionName, currentBlock);.

You'll find life easier if you can concoct meaningful names for instructions. It doesn't matter at first, but as your code grows it simplifies debugging more and more.

arnt