I have been trying to figure out the whole shifting process, and it just doesn't make sense to me. How does the code count the number of bits that were set when the right shift just throws away whatever was calculated earlier?

A function loop_while has the following overall structure:

    long loop_while(unsigned long x)
    {
        long val = 0;
        while ( ... ) {
            ...
            ...
        }
        return ...;
    }

GCC generates the following assembly code:

    long fun_a(unsigned long x)
    x in %rdi
     1     fun_a:
     2     movl $0, %eax
     3     jmp .L5
     4 .L6:
     5     xorq %rdi, %rax
     6     shrq %rdi             Shift right by 1
     7 .L5:
     8     testq %rdi, %rdi
     9     jne .L6
    10     andl $1, %eax
    11     ret

We can see that the compiler used a jump-to-middle translation, using the jmp instruction on line 3 to jump to the test starting with label .L5.
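As a rough illustration of the jump-to-middle shape, the assembly can be mirrored one instruction at a time in C using `goto`. This is only a sketch: the name `fun_a_goto`, the label names, and the helper `fun_a_goto_examples` are hypothetical, and `val` is declared `unsigned long` here (the skeleton above uses `long`) so that the xor stays in unsigned arithmetic.

```c
#include <assert.h>

/* Hypothetical goto-style mirror of the assembly listing; not the original source. */
long fun_a_goto(unsigned long x)
{
    unsigned long val = 0;   /* movl $0, %eax  (a 32-bit write also zeroes the upper half of %rax) */
    goto test;               /* jmp .L5: enter the loop at its test */
loop:                        /* .L6 */
    val ^= x;                /* xorq %rdi, %rax  (AT&T order: source, destination) */
    x >>= 1;                 /* shrq %rdi: logical shift right by 1 */
test:                        /* .L5 */
    if (x != 0)              /* testq %rdi, %rdi ; jne .L6 */
        goto loop;
    return (long)(val & 1);  /* andl $1, %eax: keep only the low bit */
}

/* A few hand-checked inputs (hypothetical helper, for illustration only). */
void fun_a_goto_examples(void)
{
    assert(fun_a_goto(0) == 0);  /* no set bits: the loop body never runs */
    assert(fun_a_goto(1) == 1);  /* one set bit */
    assert(fun_a_goto(3) == 0);  /* two set bits */
    assert(fun_a_goto(7) == 1);  /* three set bits */
}
```

Only the low bit of `val` survives the final mask, so the upper bits accumulated by the 64-bit xor do not affect the result.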

Describe what this function computes.

Select one:

a. This code computes the sign of argument x. That is, it returns 1 if the most significant bit in x is 0 and 0 if it is 1.

b. This code computes the parity of argument x. That is, it returns 1 if there is an even number of ones in x and 0 if there is an odd number.

c. This code computes the parity of argument x. That is, it returns 1 if there is an odd number of ones in x and 0 if there is an even number.

d. This code computes the sign of argument x. That is, it returns 1 if the most significant bit in x is 1 and 0 if it is 0.

Peter Cordes
tomatto
  • It doesn't count the number of set bits, only whether that number is even or odd. It does that by `xor`ing all the bits together. Since each set bit makes the `xor` toggle the result between 0 and 1, a final result of 1 means there was an odd number of set bits. The shift removes the least significant bit that was already processed by the `xor`, the result of which is in `eax`. Remember, AT&T syntax uses `src, dst` order. – Jester Jan 19 '23 at 14:50
  • The computation does a 64-bit `xor`, but is ultimately only interested in the LSB of `rax`. With `xor`, a 0 in the LSB of `rdi` does nothing (leaves the value in `rax` alone), while a 1 toggles whatever was already there. So any two 1's in `rdi` cancel each other out to become a 0 in `rax`. Each of the 64 bits is `xor`'ed (one iteration at a time) into the LSB position of `rax`, though the loop exits early once the last 1 (if any) has been shifted out. – Erik Eidt Jan 19 '23 at 15:36
  • You should get more efficient asm from `__builtin_parityll(x)`. (https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html). https://godbolt.org/z/rY6fqnYsx GCC uses `popcnt(x) & 1` if available, otherwise shift/xor to reduce the width in half repeatedly until down to 8 bits so it can use the parity flag. This is much faster than looping 64 times! – Peter Cordes Jan 19 '23 at 16:43
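To make the comments above concrete, here is a minimal sketch that puts the loop next to a shift/xor halving version and GCC's builtin. The names `parity_loop`, `parity_fold`, and `check_parity` are hypothetical. The fold below reduces all the way to a single bit in portable C; it is not the exact sequence GCC emits, which per the comment stops at 8 bits and reads the parity flag.

```c
#include <assert.h>

/* Loop form matching the assembly's behavior (hypothetical name).
   val is unsigned here so the xor stays in unsigned arithmetic. */
static long parity_loop(unsigned long x)
{
    unsigned long val = 0;
    while (x != 0) {
        val ^= x;   /* bit 0 of val toggles once for each set bit that passes through bit 0 of x */
        x >>= 1;    /* drop the bit that was just folded in */
    }
    return (long)(val & 1);  /* 1 if the number of set bits is odd, 0 if even */
}

/* Shift/xor halving: each step xors the upper half into the lower half,
   so the parity of the whole word ends up in bit 0. */
static long parity_fold(unsigned long long x)
{
    x ^= x >> 32;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return (long)(x & 1);
}

/* Cross-check the implementations on one input (hypothetical helper). */
void check_parity(unsigned long x)
{
    long expected = parity_loop(x);
    assert(parity_fold(x) == expected);
#if defined(__GNUC__)
    /* GCC/Clang builtin: returns the number of 1-bits in x modulo 2. */
    assert(__builtin_parityll(x) == expected);
#endif
}
```

Calling `check_parity` on a handful of values (for example 0, 1, 3, 7, and `~0UL`) is a quick way to confirm that all three agree on odd parity returning 1.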

0 Answers