As user stark points out (more rudely than necessary, I think) in comments, this is just long division in base 2^32. Or in base 2, with more digits. And you can use the algorithm for long division that you learned in school for base 10.
There are fancier algorithms for doing division on much bigger numbers, but for your application I suspect you can afford to do it very simply and inefficiently, along the following lines (this is the base-2 version, which just amounts to subtracting off left-shifted versions of the denominator until you can no longer do so):
    // Treat x,y as 160-bit numbers, where [0] is least significant and
    // [4] is most significant. Compute x mod y, and put the result in out.
    // y must be nonzero.
    void mod160(unsigned int out[5], const unsigned int x[5], const unsigned int y[5]) {
        unsigned int temp[5];
        copy160(out, x);
        copy160(temp, y);
        int n = 0;
        // Find first 1-bit in quotient: shift temp left while temp <= x/2
        // (rounded down). The test must be <=, not <; with a strict test,
        // inputs such as 2 mod 1 come out as 1 instead of 0.
        while (atmosthalf160(temp, x)) { lshift160(temp); ++n; }
        while (n >= 0) {
            if (!less160(out, temp)) sub160(out, temp); // quotient bit is 1
            rshift160(temp); --n; // proceed to next bit of quotient
        }
    }
For the avoidance of doubt, the above is only a crude sketch:
- It may be full of bugs.
- I haven't written the implementations of the building-block functions like `less160`.
- Actually you'd probably just put the code for those inline rather than having separate functions. E.g., `copy160` is just five assignments, or a short loop, or a `memcpy`.
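For what it's worth, here is one way the building blocks might be filled in. Treat it as a sketch under assumptions rather than a reference implementation: it uses `uint32_t` limbs (so it assumes the original `unsigned int` is 32 bits wide), the helper names `atmosthalf160` and `iszero160` are my own, and the first loop tests `temp <= x/2` rather than a strict less-than, because a strict test gives 1 instead of 0 for inputs like 2 mod 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All numbers are five 32-bit limbs; index 0 is least significant. */

static void copy160(uint32_t dst[5], const uint32_t src[5]) {
    memcpy(dst, src, 5 * sizeof dst[0]);
}

/* Return 1 if a < b, else 0. Compare from the most significant limb down. */
static int less160(const uint32_t a[5], const uint32_t b[5]) {
    for (int i = 4; i >= 0; --i) {
        if (a[i] != b[i]) return a[i] < b[i];
    }
    return 0; /* equal */
}

/* a <<= 1; the bit shifted out of the top limb is discarded. */
static void lshift160(uint32_t a[5]) {
    for (int i = 4; i > 0; --i) a[i] = (a[i] << 1) | (a[i - 1] >> 31);
    a[0] <<= 1;
}

/* a >>= 1 */
static void rshift160(uint32_t a[5]) {
    for (int i = 0; i < 4; ++i) a[i] = (a[i] >> 1) | (a[i + 1] << 31);
    a[4] >>= 1;
}

/* a -= b, assuming a >= b. The borrow propagates to higher limbs. */
static void sub160(uint32_t a[5], const uint32_t b[5]) {
    uint32_t borrow = 0;
    for (int i = 0; i < 5; ++i) {
        uint32_t bi = b[i];
        uint32_t d = a[i] - bi - borrow;
        borrow = (a[i] < bi) || (a[i] == bi && borrow);
        a[i] = d;
    }
}

/* Return 1 if a <= b/2 (rounded down), so doubling a cannot exceed b. */
static int atmosthalf160(const uint32_t a[5], const uint32_t b[5]) {
    uint32_t half[5];
    copy160(half, b);
    rshift160(half);
    return !less160(half, a);
}

static int iszero160(const uint32_t a[5]) {
    return (a[0] | a[1] | a[2] | a[3] | a[4]) == 0;
}

/* out = x mod y, by binary long division. y must be nonzero. */
void mod160(uint32_t out[5], const uint32_t x[5], const uint32_t y[5]) {
    uint32_t temp[5];
    int n = 0;
    assert(!iszero160(y));
    copy160(out, x);
    copy160(temp, y);
    /* Line temp up with the first 1-bit of the quotient. */
    while (atmosthalf160(temp, x)) { lshift160(temp); ++n; }
    while (n >= 0) {
        if (!less160(out, temp)) sub160(out, temp); /* quotient bit is 1 */
        rshift160(temp); --n;                       /* next quotient bit */
    }
}
```

Calling `mod160(r, x, y)` with `x = {0, 0, 1, 0, 0}` (that is, 2^64) and `y = {10, 0, 0, 0, 0}` should leave 6 in `r[0]` and zeros elsewhere.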
It could surely be made more efficient; e.g., that first step might do better to count leading 0-bits and then do a single shift, instead of shifting one place at a time. (The right-shifting probably doesn't want to do that, though, because half the time you will be doing only a single shift.)
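To make the "count leading 0-bits, then shift once" idea concrete, here is a rough sketch. The names `clz160`, `lshift160_by` and `align_shift160` are hypothetical, the limbs are again assumed to be `uint32_t` with index 0 least significant, and the leading-zero count uses a plain loop rather than any compiler intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leading zero bits in a 160-bit value (160 if it is zero). */
static int clz160(const uint32_t a[5]) {
    int n = 0;
    for (int i = 4; i >= 0; --i) {
        uint32_t w = a[i];
        if (w == 0) { n += 32; continue; }
        while (!(w & 0x80000000u)) { w <<= 1; ++n; }
        return n;
    }
    return n;
}

/* a <<= s for 0 <= s < 160, done in a single pass over the limbs. */
static void lshift160_by(uint32_t a[5], int s) {
    assert(s >= 0 && s < 160);
    int limbs = s / 32, bits = s % 32;
    /* Work from the top down so each source limb is read before it is overwritten. */
    for (int i = 4; i >= 0; --i) {
        uint32_t hi = (i - limbs >= 0) ? a[i - limbs] : 0;
        uint32_t lo = (i - limbs - 1 >= 0) ? a[i - limbs - 1] : 0;
        a[i] = bits ? (hi << bits) | (lo >> (32 - bits)) : hi;
    }
}

/* Shift count that lines up the top 1-bit of y with the top 1-bit of x.
   Assumes y is nonzero; returns 0 when y already has at least as many bits as x. */
static int align_shift160(const uint32_t x[5], const uint32_t y[5]) {
    int s = clz160(y) - clz160(x);
    return s > 0 ? s : 0;
}
```

The initial loop would then become `n = align_shift160(x, temp); lshift160_by(temp, n);`. After that shift `temp` has the same bit length as `x`, so it cannot overflow and `x < 2*temp` holds, which is all the subtract-and-shift loop needs.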
The "base-2^32" version may well be faster, but the implementation will be a bit more complicated.
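To give a flavour of the base-2^32 approach, here is the easy special case only: a divisor that fits in a single 32-bit limb. This is schoolbook short division with base-2^32 digits. The function name `mod160_small` is hypothetical, and this is not the general algorithm; with a multi-limb divisor each quotient digit has to be estimated and corrected, which is where the extra complication comes from.

```c
#include <assert.h>
#include <stdint.h>

/* x mod d for a 160-bit x (limb 0 least significant) and a nonzero
   32-bit divisor d. */
uint32_t mod160_small(const uint32_t x[5], uint32_t d) {
    assert(d != 0);
    uint64_t rem = 0;
    for (int i = 4; i >= 0; --i) {
        /* "Bring down" the next base-2^32 digit, then reduce.
           rem < d < 2^32, so (rem << 32) | x[i] fits in 64 bits. */
        rem = ((rem << 32) | x[i]) % d;
    }
    return (uint32_t)rem;
}
```

This takes five 64-bit divisions, compared with up to 160 compare-subtract-shift steps in the bitwise version.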