Consistent hashing SHA1 modulo operation

Question

I hope some guru here can help me out.

I am writing C/C++ code to implement consistent hashing, using SHA-1 as the hashing algorithm.

I need to implement a modulo operation like the following:

0100 0011....0110 100 mod 0010 1001 = ?

If the divisor is a power of 2, pow(2,n), it would be easy: the result is just the last n bits of the dividend (that is, the dividend ANDed with a mask such as 0000 1111 for n = 4).

A SHA-1 hash is 160 bits long (40 hex digits). My question is: how do I implement a modulo operation of a long bit string by another, arbitrary bit string?

Thank you.

Is it actually modulo an arbitrary number, or is it always modulo something of the form 0...01...1? — Gareth McCaughan, Apr 13 '16 at 11:41
@stark : pow(2,160) ~ 1e48. I don't think long int can cover that. — tmh1999, Apr 13 '16 at 11:48
... And is the modulus a full 160 bits long, or will it always be shorter? — Gareth McCaughan, Apr 13 '16 at 11:49
I don't believe stark was suggesting that everything will fit in a long int. — Gareth McCaughan, Apr 13 '16 at 11:49
@Gareth: it's an arbitrary number. The dividend is 160 bits long, and the divisor is shorter than 160 bits. @stark: Could you kindly explain the long division you are talking about? I am kind of lost here. I just don't think that a 160-bit binary hash can be represented by a long int. — tmh1999, Apr 13 '16 at 11:57
Long division is the procedure you probably learned in school for dividing arbitrary-length (decimal) numbers by one another. You can do it in other bases too; for this application you can do it on 160-digit binary numbers or on 5-digit base-2^32 numbers. — Gareth McCaughan, Apr 13 '16 at 12:00
Will the divisor always fit in, say, a 32-bit unsigned integer? That would enable the code for the "base 2^32" version to be simpler and quicker. — Gareth McCaughan, Apr 13 '16 at 12:01

score 1 · Accepted Answer · answered Apr 13 '16 at 11:59

As user stark points out (more rudely than necessary, I think) in comments, this is just long division in base 2^32. Or in base 2, with more digits. And you can use the algorithm for long division that you learned in school for base 10.

There are fancier algorithms for doing division on much bigger numbers, but for your application I suspect you can afford to do it very simply and inefficiently, along the following lines (this is the base-2 version, which just amounts to subtracting off left-shifted versions of the denominator until you can no longer do so):

// Treat x,y as 160-bit numbers, where [0] is least significant and
// [4] is most significant. Compute x mod y, and put the result in out.
void mod160(unsigned int out[5], const unsigned int x[5], const unsigned y[5]) {
  unsigned int temp[5];
  copy160(out, x);
  copy160(temp, y);
  int n=0;
  // Find first 1-bit in quotient.
  while (lessthanhalf160(temp, x)) { lshift160(temp); ++n; }
  while (n>=0) {
    if (!less160(out, temp)) sub160(out, temp); // quotient bit is 1
    rshift160(temp); --n; // proceed to next bit of quotient
  }
}

For the avoidance of doubt, the above is only a crude sketch:

It may be full of bugs.
I haven't written the implementations of the building-block functions like less160.
Actually you'd probably just put the code for those inline rather than having separate functions. E.g., copy160 is just five assignments, or a short loop, or a memcpy.

It could surely be made more efficient; e.g., that first step might do better to count leading 0-bits and then do a single shift, instead of shifting one place at a time. (The right-shifting probably doesn't want to do that, though, because half the time you will be doing only a single shift.)
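To make the sketch above concrete, here is one way the building blocks might be filled in. Treat it as a minimal sketch under assumptions rather than a tuned implementation: it assumes 32-bit words (uint32_t) with word [0] least significant, the helper bodies are illustrative, and it swaps the lessthanhalf160 test for a top-bit check so that the left shift can never overflow. A zero divisor is rejected with an assert.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WORDS 5  /* 5 x 32 bits = 160 bits; word [0] is least significant */

/* Returns 1 if a < b, else 0 (compare from the most significant word down). */
static int less160(const uint32_t a[WORDS], const uint32_t b[WORDS]) {
    for (int i = WORDS - 1; i >= 0; --i) {
        if (a[i] != b[i]) return a[i] < b[i];
    }
    return 0;
}

/* a -= b. The caller guarantees a >= b, so the final borrow is zero. */
static void sub160(uint32_t a[WORDS], const uint32_t b[WORDS]) {
    uint32_t borrow = 0;
    for (int i = 0; i < WORDS; ++i) {
        uint64_t d = (uint64_t)a[i] - b[i] - borrow;
        a[i] = (uint32_t)d;
        borrow = (uint32_t)(d >> 63);  /* 1 if the 64-bit subtraction wrapped */
    }
}

/* a <<= 1 (the bit shifted out of the top word is discarded). */
static void lshift160(uint32_t a[WORDS]) {
    for (int i = WORDS - 1; i > 0; --i) {
        a[i] = (a[i] << 1) | (a[i - 1] >> 31);
    }
    a[0] <<= 1;
}

/* a >>= 1 */
static void rshift160(uint32_t a[WORDS]) {
    for (int i = 0; i < WORDS - 1; ++i) {
        a[i] = (a[i] >> 1) | (a[i + 1] << 31);
    }
    a[WORDS - 1] >>= 1;
}

/* out = x mod y, by binary long division. y must be nonzero. */
void mod160(uint32_t out[WORDS], const uint32_t x[WORDS],
            const uint32_t y[WORDS]) {
    uint32_t temp[WORDS];
    uint32_t any = 0;
    int n = 0;

    for (int i = 0; i < WORDS; ++i) any |= y[i];
    assert(any != 0);  /* division by zero is undefined */

    memcpy(temp, y, sizeof temp);
    memmove(out, x, sizeof temp);  /* memmove, in case out and x alias */

    /* Shift the divisor left until it is >= the dividend, or until its
       top bit is set (one more shift would overflow). */
    while (!(temp[WORDS - 1] >> 31) && less160(temp, out)) {
        lshift160(temp);
        ++n;
    }
    /* One conditional subtraction per quotient bit, high to low. */
    while (n >= 0) {
        if (!less160(out, temp)) sub160(out, temp);  /* quotient bit is 1 */
        rshift160(temp);
        --n;
    }
}
```

For example, with x = {10, 0, 0, 0, 0} and y = {3, 0, 0, 0, 0}, mod160 leaves {1, 0, 0, 0, 0} in out.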

The "base-2^32" version may well be faster, but the implementation will be a bit more complicated.
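If the divisor happens to fit in a 32-bit unsigned integer (an assumption, as raised in the comments, and not true of an arbitrary divisor), the base-2^32 version becomes short: it is schoolbook short division, carrying the remainder from one 32-bit "digit" to the next. A minimal sketch, where the name mod160_u32 is made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Returns x mod m, where x is a 160-bit number stored as five 32-bit
   words with x[0] least significant, and m is a nonzero 32-bit divisor.
   (mod160_u32 is a hypothetical name, not from any library.) */
uint32_t mod160_u32(const uint32_t x[5], uint32_t m) {
    uint64_t rem = 0;
    assert(m != 0);  /* division by zero is undefined */
    for (int i = 4; i >= 0; --i) {
        /* rem < m <= 2^32 - 1, so (rem << 32) | x[i] fits in 64 bits. */
        rem = ((rem << 32) | x[i]) % m;
    }
    return (uint32_t)rem;
}
```

For consistent hashing, where the modulus is often something like a number of buckets or nodes, this restricted form may be all that is needed. A divisor wider than 32 bits needs full multi-word long division, which is where the extra complication comes in.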
