I don't know the actual mathematical term ("many-to-one mapping" is the terminology I've used).

This is my requirement:

hash_code = hash_function(element 1, element 2, ...... element n)

I should then be able to evaluate:

bool b = is_valid_hash(hash_code, element x)

The function is_valid_hash should be able to tell me whether 'element x' was one of the elements passed to hash_function.

What is the name for such hash functions? One hash should be able to map to multiple elements (not a collision).

PC.
  • This is a great question! Any solution would likely involve [homomorphic encryption](http://en.wikipedia.org/wiki/Homomorphic_encryption) of some sort, so if there's no OOTB implementation of this, this might be better migrated over to crypto.SE. – pdubs Dec 28 '11 at 15:20
  • Do you need a true hash function that is not reversible, and is secure? – Warren P Dec 28 '11 at 15:21
  • I don't know *what* they are called, but I'd call them "set enumeration hash functions". Taking a prime number for every element and multiplying them (how many possible elements are there?) seems logical. (Having a special private hash value for every possible element is called *Zobrist* hashing, BTW.) – wildplasser Dec 28 '11 at 15:24
  • @Warren: I do not need a true hash. The hash function can map to more than n elements (but the collision rate should not be very high). There is also no restriction to one-way or two-way hashing. Security is not an issue. – PC. Dec 28 '11 at 15:39
  • @wildplasser: there can be up to 1000 elements (even more), each of size 16 bytes. I did not understand your idea of multiplying with prime numbers. My functions `hash_function` and `is_valid_hash` are supposed to be on different machines, so I cannot maintain a common hash table. – PC. Dec 28 '11 at 15:44
  • The prime number thing would only be possible for a small number of items from a small domain. The idea is to assign a (unique) prime to every member of the domain. Multiplying the primes of the items present makes it possible to test whether an item was present by testing whether the product is divisible by that item's prime. But for 1000 items, the product will be way too big. – wildplasser Dec 28 '11 at 16:07
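
The prime-product idea from the comments can be illustrated with a minimal Python sketch. It assumes a small, fixed domain that both machines list in the same order, so each side can rebuild the same prime assignment on its own; the helper `assign_primes` and the extra `prime_of` parameter are hypothetical and not part of the original question.

```python
def assign_primes(domain):
    """Map each domain member to a distinct prime (simple trial division)."""
    primes = []
    candidate = 2
    while len(primes) < len(domain):
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return dict(zip(domain, primes))


def hash_function(prime_of, *elements):
    """Multiply the primes of the distinct elements that are present."""
    product = 1
    for element in set(elements):
        product *= prime_of[element]
    return product


def is_valid_hash(prime_of, hash_code, element):
    """An element was present exactly when its prime divides the product."""
    return hash_code % prime_of[element] == 0
```

For the domain `["a", "b", "c", "d"]` the primes are 2, 3, 5 and 7, so `hash_function(prime_of, "a", "c")` is 10, which is divisible by 2 and 5 but not by 3 or 7. As the comment notes, the product grows quickly, so this only suits small domains.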

2 Answers

What I was looking for is a Bloom filter.
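
A Bloom filter is a fixed-size bit array that answers "possibly present" or "definitely absent": it can give false positives but never false negatives. Below is a minimal Python sketch of how the two functions from the question might be built on one. The class name, the bit-array size, the number of hash positions, and the use of SHA-256 with double hashing are all assumptions chosen for illustration; both machines would need to agree on them.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=16384, num_hashes=7):
        # Both parameters are illustrative; tune them for the expected
        # number of elements and the acceptable false-positive rate.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, element: bytes):
        # Derive num_hashes bit positions from one SHA-256 digest
        # using double hashing: h1 + i * h2 (mod num_bits).
        digest = hashlib.sha256(element).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, element: bytes):
        for pos in self._positions(element):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, element: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(element)
        )


def hash_function(*elements: bytes) -> bytes:
    """Build a filter over all elements and return its raw bit array."""
    bloom = BloomFilter()
    for element in elements:
        bloom.add(element)
    return bytes(bloom.bits)


def is_valid_hash(hash_code: bytes, element: bytes) -> bool:
    """Rebuild the filter from the bit array and test membership."""
    bloom = BloomFilter()
    bloom.bits = bytearray(hash_code)
    return bloom.might_contain(element)
```

With these defaults the `hash_code` is always 2048 bytes, regardless of how many elements were added, and only that byte string has to travel between the two machines. A `True` from `is_valid_hash` means "probably passed in"; a `False` means "definitely not passed in".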

PC.

Assuming that hash_function is a standard hashing algorithm (MD5, etc.), this can't be done. However, if it's a custom function you could do it in one of two ways:

  1. hash_function() could hash each element and then concatenate the resulting strings (this would produce a very long hash, and it would be less secure in some ways, but it would work). is_valid_hash() could then do a substring compare, i.e. check whether the hash of element x is a substring of hash_code.

  2. Similarly, hash_function could return an array of hashes... if you need a string or security is a concern, you could also return a 2-way encrypted serialized array... this could then be decrypted and unserialized in is_valid_hash() and you could check if the element x hash is in the array.
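
The concatenation approach could be sketched in Python roughly as follows. The choice of SHA-256, the truncation to a hypothetical `DIGEST_SIZE` of 16 bytes, and the helper `_digest` are assumptions made for illustration. The sketch compares fixed-width, aligned chunks rather than doing a free substring search, since an unaligned match could straddle two neighbouring digests.

```python
import hashlib

DIGEST_SIZE = 16  # bytes kept per element; an illustrative assumption


def _digest(element: bytes) -> bytes:
    """Fixed-width hash of a single element."""
    return hashlib.sha256(element).digest()[:DIGEST_SIZE]


def hash_function(*elements: bytes) -> bytes:
    """Concatenate the fixed-width digests of all elements."""
    return b"".join(_digest(element) for element in elements)


def is_valid_hash(hash_code: bytes, element: bytes) -> bool:
    """Check whether the element's digest occupies one aligned slot."""
    target = _digest(element)
    return any(
        hash_code[i:i + DIGEST_SIZE] == target
        for i in range(0, len(hash_code), DIGEST_SIZE)
    )
```

The resulting `hash_code` grows linearly with the number of elements (`DIGEST_SIZE` bytes each), which is the "very long hash" trade-off mentioned in option 1.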

Ben D