Each bit is independent, so in a preprocessing phase[*] you could classify each entry 32 (or however big your int is) times. Each classification stores 2 sets: those which match at that bit when key is 0 and those which match when key is 1.
That is, if value == 1 and mask == 0 at that bit, then that classification doesn't store that entry at all, since it doesn't match any value of key (in fact, no matter what scheme you use, such entries should be removed during any preprocessing stage, so no classification should store an entry if even one bit is like this). If both are 0, store the entry into both sets. Otherwise store it into one of the two sets.
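A minimal sketch of that classification step. It assumes each entry is a (value, mask) pair and that an entry matches when (key & mask) == value, which is what the rules above imply. The name build_bit_sets and the data layout are made up for illustration.

```python
WIDTH = 32  # assumed key width in bits


def build_bit_sets(entries, width=WIDTH):
    """Classify (value, mask) entries once per bit position.

    Assumed match rule: (key & mask) == value.
    Returns a list where sets[bit][k] holds the indices of the entries
    that match at `bit` when the key has bit value k there.
    """
    sets = [(set(), set()) for _ in range(width)]
    for idx, (value, mask) in enumerate(entries):
        # value == 1 under mask == 0 at any bit: the entry can never match,
        # so drop it from every classification.
        if value & ~mask:
            continue
        for bit in range(width):
            v = (value >> bit) & 1
            m = (mask >> bit) & 1
            if m == 0:
                # value == 0 and mask == 0: matches whatever the key bit is.
                sets[bit][0].add(idx)
                sets[bit][1].add(idx)
            else:
                # mask == 1: the key bit has to equal the value bit.
                sets[bit][v].add(idx)
    return sets
```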
Then, given your key, you want to find a fast intersection of 32 sets.
Depending on the size of the original array, it may be that the best way to store each set is a giant bit array indicating whether each entry in the array is in the set or not. Then finding the intersection can be done a word at a time - & together 32 words, one from each bit array. If the result is 0, keep going. If the result is non-0, you have a match, and the bit that's set in the result tells you which entry is the match. This is still linear in the size of the array, of course, and in fact you're doing 31 & operations to check 32 entries for a match, which is about the same as the simple linear search through the original array. But there's less comparison and branching, and the data you're looking at is more compressed, so you might get better performance.
Or there might be a better way to do the intersection.
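Here is a rough sketch of the bit-array intersection, under the same assumed match rule, (key & mask) == value. Python's arbitrary-size ints stand in for the bit arrays, so the word-at-a-time loop is hidden inside each &. In C you would walk the arrays one machine word at a time instead. The function names are hypothetical.

```python
WIDTH = 32  # assumed key width in bits


def build_bit_arrays(entries, width=WIDTH):
    # arrays[bit][k] is an int used as a bit array: bit i is set when
    # entry i matches at `bit` for a key whose bit there is k.
    arrays = [[0, 0] for _ in range(width)]
    for idx, (value, mask) in enumerate(entries):
        if value & ~mask:
            continue  # can never match any key
        for bit in range(width):
            if (mask >> bit) & 1:
                arrays[bit][(value >> bit) & 1] |= 1 << idx
            else:
                arrays[bit][0] |= 1 << idx
                arrays[bit][1] |= 1 << idx
    return arrays


def find_first_match(arrays, key, num_entries):
    """Return the index of the first matching entry, or None."""
    result = (1 << num_entries) - 1  # start with every entry as a candidate
    for bit, pair in enumerate(arrays):
        result &= pair[(key >> bit) & 1]
        if result == 0:
            return None  # intersection is already empty
    # The lowest set bit is the first entry present in all the sets.
    return (result & -result).bit_length() - 1
```

Taking the lowest set bit gives the first entry in array order; if any match will do, any set bit works.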
If keys tend to be re-used then you should cache the results of the lookup in a map from keys to entries. If the number of possible keys is reasonably small (that is, if significantly fewer than 2^32 keys are possible inputs, and/or you have a lot of memory available), then your preprocessing phase could just be:
- take each entry in turn
- work out which possible keys it matches
- add it to the map for those keys
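Those three steps might look something like this, again assuming entries are (value, mask) pairs that match when (key & mask) == value, that value and mask fit in key_bits bits, and that the first matching entry is the one you want. build_key_map is a made-up name.

```python
def build_key_map(entries, key_bits):
    """Map every possible key to the index of its first matching entry.

    Only sensible when the key space (2**key_bits) is small.
    Keys that no entry matches are simply absent from the map.
    """
    table = {}
    full = (1 << key_bits) - 1
    for idx, (value, mask) in enumerate(entries):
        if value & ~mask:
            continue  # can never match any key
        free = ~mask & full  # bits this entry doesn't care about
        # Walk every subset of the free bits; each one, OR-ed with value,
        # is a key this entry matches.
        sub = free
        while True:
            table.setdefault(value | sub, idx)  # keep the earliest entry
            if sub == 0:
                break
            sub = (sub - 1) & free
    return table
```

After that, a lookup is just table.get(key).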
[*] Without any preprocessing, obviously all you can do is check every array member until either you find a match or else you've checked everything.
It will be enough to find the first entry that applies – Serge C Jul 05 '11 at 14:07