I'm simulating set intersection approximation using Bloom filters. I have tried a lot of simple hash functions to hash the values into the filter, but they are not good at avoiding collisions. Somebody suggested a universal hash function, but I'm not sure how it works. My program is designed to pass just the key to the hash function, and the hash function returns the hash. Can anyone help me with the code? Thanks.
- What, specifically, is the problem? – Oliver Charlesworth Feb 11 '12 at 21:52
- You are very much on the wrong track. If you had a perfect universal hashing function then using a bloom filter would be pointless. They are useful if you have *imperfect* ones. And un-universal ones, it requires a set of hashing functions. – Hans Passant Feb 11 '12 at 22:07
1 Answer
Don't worry about hash collisions when using Bloom filters; you don't have to handle collisions in this case. Just get k different hash functions, which set k bits in an m-bit array when you insert an element. At query time, you again use all k hash functions to check those k bits. If any one of them is not set, the element is definitely not in the set. If all of them are set, the element is probably in the set, but you can't be certain (false positives are possible). This is clearly explained on Wikipedia.
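
To make the insert/query steps concrete, here is a minimal Python sketch. It is not from the original post: the class name `BloomFilter`, the method names, the prime `P`, and the sizes `m=1024`, `k=3` are all assumptions chosen for illustration. It assumes the keys are non-negative integers smaller than `P`, and it draws the k hash functions from the family h(x) = ((a*x + b) mod P) mod m with random a and b, which is one common way to build a universal family. Other ways of getting k hash functions would work just as well.

```python
import random

# Assumption: a prime larger than any key we expect to hash (2^61 - 1).
P = (1 << 61) - 1


class BloomFilter:
    """Illustrative Bloom filter for non-negative integer keys."""

    def __init__(self, m, k, seed=0):
        self.m = m                    # number of bits in the filter
        self.k = k                    # number of hash functions
        self.bits = [False] * m
        rng = random.Random(seed)     # fixed seed so runs are repeatable
        # Each (a, b) pair selects one function h(x) = ((a*x + b) % P) % m.
        self.params = [(rng.randrange(1, P), rng.randrange(0, P))
                       for _ in range(k)]

    def _positions(self, key):
        # The k bit positions for this key, one per hash function.
        return [((a * key + b) % P) % self.m for a, b in self.params]

    def add(self, key):
        # Insert: set all k bits. Collisions are simply ignored.
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # Query: False means "definitely absent";
        # True means "probably present" (could be a false positive).
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter(m=1024, k=3)
for x in (10, 20, 30):
    bf.add(x)
```

With this sketch, `bf.might_contain(10)` is always true because 10 was inserted. For a key that was never inserted the result is usually false, but it can be true, and that false positive rate is what m and k let you tune.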

cforfun