My question is closely related to this topic:
Hash function on list independant of order of items in it
Basically, I have a set of N numbers. N is fixed and is typically quite large, e.g. 1000. These numbers can be integers or floating-point values. Some or all of them may be equal. No number can be zero.
Every combination of K numbers, where K is anything between 1 and N, leads to the calculation of a hash.
Let's take an example with 3 numbers, that I will call A, B and C. I need to calculate a hash for the following combinations:
A
B
C
A+B
B+C
A+B+C
A+C
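To make the enumeration concrete, here is a minimal Python sketch that lists every non-empty combination of a small set. The helper name `all_combinations` is purely illustrative. For N items there are 2^N - 1 combinations, so enumerating them like this is only practical for tiny N, not for N around 1000.

```python
from itertools import combinations

def all_combinations(items):
    """Yield every non-empty combination of items, for K = 1..N."""
    for k in range(1, len(items) + 1):
        yield from combinations(items, k)

combos = list(all_combinations(["A", "B", "C"]))
print(len(combos))  # 7, i.e. 2**3 - 1
```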
Things are order-independent: C+A is just equal to A+C. '+' can be a real addition or something different, like XOR, but it is fixed. Likewise, every value may go through a function first, e.g.:
f(A)
f(B)
f(A)+f(B)+f(C)
...
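As a sketch of what I mean by an order-independent combination, assuming `f` is some per-value mixing function: the `f` below (a 64-bit BLAKE2b digest of the value's double-precision encoding) and the names `combine_xor` and `combine_add` are hypothetical choices for illustration, not something I am committed to.

```python
import hashlib
import struct

def f(x):
    """Hypothetical per-value function: 64-bit digest of x encoded as a double.

    Assumption: converting to float is acceptable (very large integers
    would lose precision here).
    """
    digest = hashlib.blake2b(struct.pack("<d", float(x)), digest_size=8).digest()
    return int.from_bytes(digest, "little")

def combine_xor(values):
    """'+' implemented as XOR of f(v); the result does not depend on order."""
    h = 0
    for v in values:
        h ^= f(v)
    return h

def combine_add(values, bits=64):
    """'+' implemented as addition of f(v) modulo 2**bits; also order-independent."""
    return sum(f(v) for v in values) & ((1 << bits) - 1)

print(combine_xor([1.5, 2, 7]) == combine_xor([7, 1.5, 2]))  # True
```

One thing to keep in mind: since some of my numbers can be equal, XOR makes a pair of identical values cancel out (`combine_xor([3, 3])` is 0), whereas modular addition does not have that particular problem.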
Now, I need to avoid collisions, but only in a specific way. Each combination is tagged with a number, either 0 or 1. Ideally, collisions should occur only between combinations tagged with the same number. Such collisions are even welcome, especially if they make the hash value compact: the ideal hash would be only 1 bit long (0 or 1)! Collisions between combinations tagged with different numbers (0 and 1) should happen only rarely, if at all. This is the whole point.
Let's take an example. Combination -> tag -> calculated hash value:
Combination -> Tag -> Hash
A -> 0 -> 0
B -> 1 -> 1
C -> 0 -> 2
A+B -> 0 -> 0
B+C -> 1 -> 1
A+B+C -> 1 -> 3
A+C -> 0 -> 2
Here, the hash values are valid because there is no collision between combinations of different tags. A collides with A+B for instance, but they're both tagged '0'.
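The validity condition can be checked mechanically. This is a minimal sketch, with a hypothetical helper `cross_tag_collisions`, that reports every hash value shared by combinations with different tags, applied to the example table:

```python
def cross_tag_collisions(rows):
    """rows: iterable of (combination, tag, hash_value).

    Return the sorted hash values that are used by both tag 0 and tag 1.
    """
    tags_by_hash = {}
    for _combo, tag, h in rows:
        tags_by_hash.setdefault(h, set()).add(tag)
    return sorted(h for h, tags in tags_by_hash.items() if len(tags) > 1)

table = [
    ("A", 0, 0), ("B", 1, 1), ("C", 0, 2), ("A+B", 0, 0),
    ("B+C", 1, 1), ("A+B+C", 1, 3), ("A+C", 0, 2),
]
print(cross_tag_collisions(table))  # [] : no hash value mixes the two tags
```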
However, the hash is not very good overall, because it takes 4 distinct values (2 bits), which seems a lot for only 3 input numbers.
How can I find a good (or good enough) hash function for this purpose?
Thank you for your insight.