
What are some good hash functions for implementing the Rabin-Karp string search algorithm? I only know of the polynomial hash, but it has some flaws. Most notably, if hashing is done modulo 2^64, there is a known test that is guaranteed to produce collisions very often (and using another modulus is impractical because the mod operation is very expensive). So, is there a fast, easy-to-write, good hash function?

P.S. I know about buzhash, but I am wondering if there are other alternatives…

templatetypedef
Emily
  • mod (%) is not expensive. It *used* to be expensive in the 1980s. Question: why should the hash function be *fast*? – wildplasser Oct 04 '12 at 21:45
  • BTW: modulo (1<<64) is certainly not expensive. If you use 64-bit *unsigned* types, it is *completely free*. – wildplasser Oct 04 '12 at 21:54
  • @wildplasser his comment about particular tests seems to imply this question is being asked from a programming competition standpoint, in which modulo 1<<64 is impractical: http://codeforces.com/blog/entry/4898 – ffao Oct 05 '12 at 02:07
  • @ffao You’re right. In fact, I am asking *precisely* because of the post you’ve linked :) – Emily Oct 05 '12 at 21:03

1 Answer


Since it's not a security hash and you just need a "good" fingerprint, I would suggest something like tabulation hashing. The whole operation should be several times faster than a mod operation.
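To make the idea concrete, here is a minimal Python sketch of a table-lookup rolling hash used inside Rabin-Karp. It rests on some assumptions: the answer does not say how to make tabulation hashing roll across a window, so this sketch rotates the running value by one bit per byte, which is essentially the cyclic polynomial (buzhash) scheme the question already mentions. The names `make_table`, `rotl`, and `find_all`, the fixed seed, and the 64-bit width are all illustrative choices, not something from the answer.

```python
import random

MASK = (1 << 64) - 1  # keep every value within 64 bits


def rotl(x: int, r: int) -> int:
    """Rotate a 64-bit value left by r bits (r is reduced modulo 64)."""
    r %= 64
    if r == 0:
        return x
    return ((x << r) | (x >> (64 - r))) & MASK


def make_table(seed: int = 12345) -> list:
    """One random 64-bit word per byte value (hypothetical fixed seed)."""
    rng = random.Random(seed)
    return [rng.getrandbits(64) for _ in range(256)]


def find_all(text: bytes, pattern: bytes, table=None) -> list:
    """Return every index where pattern occurs in text."""
    table = table or make_table()
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    def full_hash(window: bytes) -> int:
        # After m steps the first byte's word has been rotated m - 1 times.
        h = 0
        for b in window:
            h = rotl(h, 1) ^ table[b]
        return h

    target = full_hash(pattern)
    h = full_hash(text[:m])
    hits = []
    for i in range(n - m + 1):
        # Compare the bytes on a hash match so collisions cannot cause
        # false positives.
        if h == target and text[i:i + m] == pattern:
            hits.append(i)
        if i + m < n:
            # Slide the window: rotate, cancel the outgoing byte (its word
            # is now rotated m times), and mix in the incoming byte.
            h = rotl(h, 1) ^ rotl(table[text[i]], m) ^ table[text[i + m]]
    return hits


print(find_all(b"abracadabra", b"abra"))  # [0, 7]
```

Each window update costs two table lookups, two rotations, and two XORs, with no multiplication or modulo. Because this variant has its own known weak inputs, the explicit byte comparison on a hash match is kept; it affects only speed in the worst case, not correctness.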

Ami Tavory
iampat