
Thanks for taking a look at this question.

I saw the following piece of code inside a traditional for block, but was not sure what its significance was inside its context.

index <<= 1;

For further context, here is the full block of code.

ulong index = 1;
int distance = 0;
for (int i = 0; i < 64; i++)
{
    if ((hash1 & index) != (hash2 & index))
    {
        distance++;
    }

    index <<= 1;
}

Is it simply making sure that index is still 1 and, if it isn't, returning its value to 1?

Secondly, what is this called, so I can read up on it some more?

Finally, thank you for your time and consideration in this matter.

MrB

2 Answers


The code in question is spinning through a pair of 64-bit hashes (probably as ulongs, like the index), and checking how many bits differ between them. I'm going to use 4-bit values for example purposes, but the principle is the same.


 if ((hash1 & index) != (hash2 & index))

The & operator is doing a bitwise-AND operation. When the hash is ANDed with the index value, you get either 0 or the index value back, depending on whether that specific bit was 0 or 1. (1010 & 0010 == 0010 and 1010 & 0100 == 0000).
If both ANDs produce 0, or both produce the index value, then the two hashes match at that bit. Otherwise they don't, and distance++; records that they differ by one more bit than we knew of before.

index <<= 1;

This line merely bumps the index digit to the next bit. It does this by taking the old index (which starts as 1, equal to 0001), and left shifting by one place (<< 1), then setting that back into the index variable (<<= instead of <<). So after the first loop, index will be 0010, then 0100, and so on.

This has the effect of multiplying by 2, but that's not its intended use here.
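To make the progression of index concrete, here is a minimal sketch that records the mask after each shift. The class and method names (ShiftDemo, MaskSequence) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class ShiftDemo
{
    // Hypothetical helper: returns the successive values of a mask that
    // starts at 1 and is left-shifted by one bit per step, as binary strings.
    public static List<string> MaskSequence(int bits)
    {
        var result = new List<string>();
        ulong index = 1;
        for (int i = 0; i < bits; i++)
        {
            // Convert.ToString(long, 2) formats the value in base 2;
            // the unchecked cast keeps the raw bit pattern for any ulong.
            string binary = Convert.ToString(unchecked((long)index), 2);
            result.Add(binary.PadLeft(bits, '0'));
            index <<= 1; // same as: index = index << 1;
        }
        return result;
    }
}
```

Calling ShiftDemo.MaskSequence(4) should yield 0001, 0010, 0100, 1000: exactly one bit is set at a time, and it moves one place to the left on every pass.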


So overall, you'd get a distance of 2 by running 0011 and 1111 through this algorithm, because two bits are different.
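The 4-bit example can be checked by wrapping the loop from the question in a method. This is only a sketch: the HashDistance class, the BitDistance name, and the bits parameter (added so the 4-bit case can be run directly) are assumptions and do not appear in the original code.

```csharp
using System;

public static class HashDistance
{
    // Counts the bit positions, among the lowest `bits` bits, at which the
    // two values differ. With bits = 64 this is the loop from the question.
    public static int BitDistance(ulong hash1, ulong hash2, int bits = 64)
    {
        ulong index = 1;   // mask with a single bit set, starting at bit 0
        int distance = 0;
        for (int i = 0; i < bits; i++)
        {
            // Each AND yields either 0 or index; unequal results mean the
            // two hashes differ at this bit position.
            if ((hash1 & index) != (hash2 & index))
            {
                distance++;
            }

            index <<= 1;   // move the mask to the next bit
        }
        return distance;
    }
}
```

With binary literals (C# 7.0 or later), HashDistance.BitDistance(0b0011, 0b1111, 4) returns 2, matching the hand calculation above.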

Bobson
  • Thank you very much Bobson! I appreciate the in-depth answer! It's been many moons since I last did any bit shifting. In fact, I've only ever done bit shifting in college classrooms, and not in the actual real world. – MrB Aug 10 '15 at 04:08
  • @MrB - I can't recall a time that I've actually used it in the real world either, except when implementing someone else's algorithm which included it (like this hash-distance checker). Glad it helped. – Bobson Aug 10 '15 at 04:13
  • Interesting to note that you can do the same thing with `result = hash1 ^ hash2;`, and then counting the set bits using one of the methods described on the [bithacks page](https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel). – Jim Mischel Aug 10 '15 at 13:18
  • @JimMischel - I'd had the thought of using the XOR of the hashes, which is much simpler than iterating over each bit, but then I realized you still needed to do the bit counting on the result. It becomes a speed/complexity tradeoff. – Bobson Aug 10 '15 at 16:00
  • Yes, the bithacks counting methods are somewhat more complex. Only necessary if speed is a primary consideration. – Jim Mischel Aug 10 '15 at 16:19
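The XOR approach discussed in these comments might look roughly like the sketch below. It uses the simple clear-the-lowest-set-bit loop (often attributed to Brian Kernighan) rather than the parallel method linked above, and the XorDistance and BitDistance names are hypothetical.

```csharp
using System;

public static class XorDistance
{
    // XOR leaves a 1 in every position where the two hashes differ,
    // so the distance is the number of set bits in the XOR result.
    public static int BitDistance(ulong hash1, ulong hash2)
    {
        ulong diff = hash1 ^ hash2;
        int count = 0;
        while (diff != 0)
        {
            diff &= diff - 1; // clears the lowest set bit
            count++;
        }
        return count;
    }
}
```

The loop runs once per differing bit instead of always 64 times. On .NET Core 3.0 or later, System.Numerics.BitOperations.PopCount(hash1 ^ hash2) is another option for the counting step.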

The code

index <<= 1;

is a left shift by one bit. In this case it has the same effect as multiplying by two, but see the comments for cautions.

Jim Mischel
  • Just for clarity, when written in longhand it is `index = index << 1;` – Code Uniquely Aug 10 '15 at 03:11
  • It is **not** the same thing as multiplying by two. That's a dangerous simplification. It's similar, and it can be used interchangeably, but only if you can guarantee that `index` is *always* less than 2^63. If it's 2^63 or greater (2^31 for `uint`), a real `*2` operation will throw an exception in a `checked` context (C# arithmetic is unchecked by default), while the `<<= 1` will always silently drop the high bit and return the wrong number. – Euro Micelli Aug 10 '15 at 03:29
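The difference described in this comment can be seen with a small sketch. The OverflowDemo class and its method names are invented for illustration; the explicit checked(...) expression is used so the behaviour does not depend on compiler settings.

```csharp
using System;

public static class OverflowDemo
{
    // A shift never throws: the bit shifted out of the top is discarded.
    public static ulong ShiftLeftOnce(ulong value)
    {
        return value << 1;
    }

    // Reports whether doubling the value overflows when the multiplication
    // is evaluated in an explicitly checked context.
    public static bool CheckedDoubleThrows(ulong value)
    {
        try
        {
            _ = checked(value * 2);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }
}
```

For value = 1UL << 63, ShiftLeftOnce returns 0 while CheckedDoubleThrows returns true. In the loop from the question this is harmless, since index only becomes 0 after the 64th and final iteration.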