I have a set of bit patterns and want to find the index of the element in the set that matches a given input. The bit patterns contain "don't care" bits, written as x, which match both 0 and 1.
Example: the set of bit patterns is
index abcd
0 00x1
1 01xx
2 100x
3 1010
4 1x11
Then, trying to match 0110 should return index 1 and 1011 should return index 4.
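To make the matching rule concrete, here is a minimal sketch of the linear-search baseline I am trying to beat. It assumes each pattern is stored as a (mask, value) pair of integers, where the mask has a 1 in every position that is not an x. The names `parse_pattern` and `linear_match` are made up for this illustration.

```python
def parse_pattern(text):
    """Turn a string such as '1x11' into a (mask, value) pair of ints.

    mask has a 1 wherever the pattern cares about the bit;
    value holds the required bits (0 in every don't-care position).
    """
    mask = value = 0
    for ch in text:
        mask <<= 1
        value <<= 1
        if ch != "x":
            mask |= 1
            value |= int(ch)
    return mask, value


# The example set from the table above, in index order.
PATTERNS = [parse_pattern(p) for p in ("00x1", "01xx", "100x", "1010", "1x11")]


def linear_match(patterns, key):
    """Return the index of the first pattern matching key, or None."""
    for index, (mask, value) in enumerate(patterns):
        if (key & mask) == value:
            return index
    return None


print(linear_match(PATTERNS, 0b0110))  # 1
print(linear_match(PATTERNS, 0b1011))  # 4
print(linear_match(PATTERNS, 0b0000))  # None (not represented in the set)
```

Each comparison is a single AND plus an equality test on a machine word for 64-bit patterns, so the cost is dominated by the number of elements scanned.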
How can this problem be solved faster than a linear search through the elements? I guess some kind of binary tree could be built, but what is a smart way of creating such a tree? Are there other efficient data structures/algorithms for this problem, primarily in terms of query efficiency but also storage requirements?
- The bit patterns will be 64 bits (or more)
- The number of elements in the set will be on the order of 10^5 to 10^7
- Not all bit combinations are represented in the set, e.g. in the example 0000 is not represented
- There will be a high number of x-es in the data set
- A bit string will match only one of the elements in the set
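The last property (an input matches at most one element) can be stated as a pairwise condition: two patterns can both match some input only if they agree on every bit position where neither has an x. A minimal sketch of that check, again assuming a (mask, value) representation; `overlaps` and `first_overlap` are hypothetical helper names:

```python
from itertools import combinations


def parse_pattern(text):
    """Turn a string such as '1x11' into a (mask, value) pair of ints."""
    mask = value = 0
    for ch in text:
        mask <<= 1
        value <<= 1
        if ch != "x":
            mask |= 1
            value |= int(ch)
    return mask, value


def overlaps(a, b):
    """True if some input matches both patterns.

    That happens exactly when the two patterns do not disagree on
    any bit that both of them care about.
    """
    (mask_a, value_a), (mask_b, value_b) = a, b
    return ((value_a ^ value_b) & mask_a & mask_b) == 0


def first_overlap(patterns):
    """Return the indices (i, j) of the first overlapping pair, or None."""
    for (i, a), (j, b) in combinations(enumerate(patterns), 2):
        if overlaps(a, b):
            return i, j
    return None


example = [parse_pattern(p) for p in ("00x1", "01xx", "100x", "1010", "1x11")]
print(first_overlap(example))  # None: the example set is pairwise disjoint
```

This check is quadratic in the number of patterns, so it is only meant to pin down the property, not to be run on 10^7 elements.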
I have two different cases in which I need to solve this problem:
- Case 1: I have the possibility of doing a lot of precomputing
- Case 2: New elements will be added to the set on the fly
Update: The x-es are more likely to show up in some bit positions than in others; that is, some bit positions will be dominated by x-es while others will be mainly zeroes/ones.
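To illustrate what I mean by the skew, here is a small sketch that counts the x-es per bit position for patterns given as strings. The function name `wildcard_counts` is hypothetical, and the four-bit example set stands in for the real data.

```python
def wildcard_counts(patterns):
    """Count, for each bit position, how many patterns have an x there.

    Assumes all patterns are strings of the same length over '0', '1', 'x'.
    """
    width = len(patterns[0])
    counts = [0] * width
    for pattern in patterns:
        for position, ch in enumerate(pattern):
            if ch == "x":
                counts[position] += 1
    return counts


example = ["00x1", "01xx", "100x", "1010", "1x11"]
print(wildcard_counts(example))  # [0, 1, 2, 2]
```

In the example, position a never holds an x, while positions c and d hold one in two of the five patterns.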