That's a hack to turn the bsf into tzcnt on processors that support it. It sure would have warranted a comment in the code, though. To quote the instruction set reference:
0F BC /r BSF r32, r/m32
F3 0F BC /r TZCNT r32, r/m32
TZCNT counts the number of trailing least significant zero bits in
source operand (second operand) and returns the result in destination
operand (first operand). TZCNT is an extension of the BSF instruction.
The key difference between TZCNT and BSF instruction is that TZCNT
provides operand size as output when source operand is zero while in
the case of BSF instruction, if source operand is zero, the content of
destination operand are undefined. On processors that do not support
TZCNT, the instruction byte encoding is executed as BSF.
(The REP prefix is F3, of course.)
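
To make the difference concrete, here is a rough portable C model of the two behaviours described in the quote. It is only a sketch: the helper names (tzcnt32_model, bsf32_model) are made up for illustration and are not intrinsics or anything from the code in question, and the "undefined" BSF result for a zero source is modelled as an assertion rather than any real hardware behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of TZCNT on a 32-bit source operand:
   counts trailing zero bits and returns the operand size (32)
   when the source is zero. */
static unsigned tzcnt32_model(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;          /* TZCNT: operand size as output */

    while ((x & 1u) == 0) { /* shift until the lowest set bit reaches bit 0 */
        x >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical model of BSF: the destination is undefined for a zero
   source, so this sketch treats zero as a precondition violation.
   For any non-zero source the destination value matches TZCNT. */
static unsigned bsf32_model(uint32_t x)
{
    assert(x != 0);
    return tzcnt32_model(x);
}
```

For every non-zero input the two models return the same value, which is why the F3-prefixed encoding is safe to emit whenever the surrounding code already guarantees the source is non-zero: it gives the same result whether the processor decodes it as BSF or as TZCNT.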