I work on GPL-licensed C++ code that does heavy data processing. A common pattern for us is to collect a batch (thousands to millions) of keys or key/value pairs (usually int32 to int128), insert them into a hash set or hash map, and then query it without any further modification.
I call this an immutable hash table, although "single-assignment hash table" may be an even better name, since we never query it before it is fully constructed.
Today we use the STL's std::unordered_map and std::unordered_set, but we are looking for a better (above all, faster) library. Can you recommend anything that suits this situation and has a GPL-compatible license?
I think the most efficient approach would be to radix-sort all keys by bucket number and store a bucket -> range mapping, so that a key can be looked up with the following code:
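    #include <cstddef>

    // set.bucket has BUCKETS + 1 offsets; bucket b owns keys[bucket[b] .. bucket[b+1] - 1].
    template <class Set, class Key>
    bool contains(const Set& set, const Key& key) {
        const std::size_t h = hash(key);
        const std::size_t b = h % BUCKETS;
        // Scan only the contiguous slice of keys that landed in bucket b.
        for (std::size_t i = set.bucket[b]; i < set.bucket[b + 1]; ++i)
            if (set.keys[i] == key) return true;
        return false;
    }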
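For concreteness, here is a rough sketch of how I imagine the construction step, not a tuned implementation. It uses a counting sort on the bucket index (effectively a one-pass radix sort) to lay the keys out bucket by bucket. Several things are assumptions made only for illustration: the names FrozenSet and mix are made up, keys are fixed to 64-bit integers, the bucket count simply equals the number of keys, and the hash is the splitmix64 finalizer standing in for whatever hash we would really use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in hash (splitmix64 finalizer); any well-distributed hash would do.
inline std::uint64_t mix(std::uint64_t x) {
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

// Hypothetical name; this is not an existing library type.
class FrozenSet {
public:
    // Builds the table once from the complete key collection.
    // Duplicate keys are stored as-is; they do not affect lookups.
    explicit FrozenSet(const std::vector<std::uint64_t>& input)
        : bucket_count_(input.empty() ? 1 : input.size()),  // assumed load factor of 1
          bucket_(bucket_count_ + 1, 0),
          keys_(input.size()) {
        // Pass 1: count the keys of bucket b in slot b + 1.
        for (std::uint64_t k : input) ++bucket_[index_of(k) + 1];
        // Prefix sum: bucket_[b] becomes the start offset of bucket b.
        for (std::size_t b = 0; b < bucket_count_; ++b) bucket_[b + 1] += bucket_[b];
        assert(bucket_.back() == keys_.size());  // offsets cover every key
        // Pass 2: scatter each key into the next free slot of its bucket.
        std::vector<std::size_t> cursor(bucket_.begin(), bucket_.end() - 1);
        for (std::uint64_t k : input) keys_[cursor[index_of(k)]++] = k;
    }

    // Scans only the contiguous range that belongs to the key's bucket.
    bool contains(std::uint64_t key) const {
        const std::size_t b = index_of(key);
        for (std::size_t i = bucket_[b]; i < bucket_[b + 1]; ++i)
            if (keys_[i] == key) return true;
        return false;
    }

    std::size_t size() const { return keys_.size(); }

private:
    std::size_t index_of(std::uint64_t key) const {
        return static_cast<std::size_t>(mix(key) % bucket_count_);
    }

    std::size_t bucket_count_;
    std::vector<std::size_t> bucket_;   // bucket_count_ + 1 offsets into keys_
    std::vector<std::uint64_t> keys_;   // keys grouped by bucket
};
```

Usage would be along the lines of FrozenSet s(keys); followed by s.contains(k). The appeal for me is that the whole structure is two flat arrays with no per-node allocation, and a lookup touches one offset pair plus one short contiguous run of keys.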
What do you think of this approach? Can you propose a faster way to implement an immutable map/set?