Borrowing from my answer to a similar question, and having a quick look at Python documentation to try to guess valid syntax...
The code you posted is OK, but the intermediate product is probably being computed at a higher precision than it needs, and the modulo involves a division, which also slows things down.
To make it faster, you can fix c at a power of two and use binary & (and) instead of modulo, which gives you this:
h(x) = (a * x + b) & ((1 << 32) - 1)
which is the same as:
h(x) = (a * x + b) & (4294967296 - 1)
which is the same as:
h(x) = (a * x + b) % 4294967296
and you must ensure that a is an odd number (this is all that's needed to make it co-prime with c when c is a power of two). This example limits the output range to a 32-bit integer. You can change that as you see fit; Python integers are arbitrary-precision, so the mask is the only thing bounding the result.
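As a quick sanity check of the two claims above (mask equals modulo, and an odd multiplier is enough), here is a minimal sketch. The constants A and B and the helper names are made up for illustration, and the exhaustive check uses an 8-bit width only because that is small enough to enumerate.

```python
MASK32 = (1 << 32) - 1  # same value as 4294967296 - 1

# Illustrative constants only (not from the question); A must be odd.
A = 2654435761
B = 12345

def h_mask(x):
    """The fast form: mask off everything above the low 32 bits."""
    return (A * x + B) & MASK32

def h_mod(x):
    """The equivalent modulo form."""
    return (A * x + B) % (1 << 32)

def is_permutation(a, b, bits=8):
    """Check exhaustively whether x -> (a*x + b) mod 2**bits hits every value once."""
    size = 1 << bits
    return len({(a * x + b) & (size - 1) for x in range(size)}) == size

if __name__ == "__main__":
    print(all(h_mask(x) == h_mod(x) for x in range(100000)))
    print(is_permutation(5, 3))  # odd multiplier
    print(is_permutation(6, 3))  # even multiplier
```

With an odd multiplier every output value is hit exactly once; with an even one, x and x + 128 collide in the 8-bit case, so half the output range is never used.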
If you want more parameterisation than that, or you discover that the results aren't "random" enough (this construction would fail statistical tests very quickly, but that usually doesn't matter), then you can add more operations. You can't just add more adds and multiplies, though, because a chain of adds and multiplies always simplifies to a single multiply-and-add pair, so the extra operations wouldn't fix anything.
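To see why stacking adds and multiplies doesn't help, here is a small sketch that composes two such steps and shows they equal a single one. The constants and the helper names (affine, compose) are arbitrary choices for the demonstration.

```python
MASK32 = (1 << 32) - 1

def affine(a, b):
    """Return the function x -> (a*x + b) mod 2**32."""
    return lambda x: (a * x + b) & MASK32

def compose(a1, b1, a2, b2):
    """Return (a, b) with a*x + b == a2*(a1*x + b1) + b2 (mod 2**32)."""
    return (a2 * a1) & MASK32, (a2 * b1 + b2) & MASK32

# Arbitrary illustrative constants with odd multipliers.
A1, B1 = 1664525, 1013904223
A2, B2 = 22695477, 1

first = affine(A1, B1)
second = affine(A2, B2)
combined = affine(*compose(A1, B1, A2, B2))

if __name__ == "__main__":
    # Two steps in sequence give the same result as the single combined step.
    print(all(second(first(x)) == combined(x) for x in range(100000)))
```

The combined multiplier is just a1 * a2 and the combined addend is a2 * b1 + b2, both reduced mod 2**32, so the two-step version has no more mixing power than the one-step version.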
What you can do instead is to use bit shifts and exclusive-or to break up the linearity; like so:
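# a, b, c, d are constants chosen elsewhere; a and c must be odd.
# (Here c is a second multiplier, not the modulus from above.)
def h(x):
    x = x ^ (x >> 16)
    x = (a * x + b) & ((1 << 32) - 1)
    x = x ^ (x >> 16)
    x = (c * x + d) & ((1 << 32) - 1)
    x = x ^ (x >> 16)
    return x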
You can experiment with variations on that if you want. If you set b and d to zero and change the middle 16 to 13 then you get the MurmurHash3 finaliser construction, which is near enough to ideal for most purposes provided you pick good a and c (sadly they can't just be random).
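For reference, here is a sketch of that finaliser in Python. The two multipliers are the ones I recall from the reference MurmurHash3 32-bit finaliser (fmix32), so treat them as an assumption and check them against the original source before relying on them.

```python
MASK32 = (1 << 32) - 1

def fmix32(x):
    """MurmurHash3-style 32-bit finaliser: xor-shift, multiply, repeat."""
    x &= MASK32                    # make sure we start with a 32-bit value
    x ^= x >> 16
    x = (x * 0x85EBCA6B) & MASK32  # a (odd), as recalled from MurmurHash3
    x ^= x >> 13
    x = (x * 0xC2B2AE35) & MASK32  # c (odd), as recalled from MurmurHash3
    x ^= x >> 16
    return x

if __name__ == "__main__":
    for value in (1, 2, 3, 4):
        print(hex(fmix32(value)))
```

Every step is invertible on 32-bit values, so distinct inputs always give distinct outputs. One consequence of dropping b and d is that zero maps to zero, which is worth knowing if your keys can be zero.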