I am having trouble adapting the pending C++1z proposal N3980 by @HowardHinnant to work with tabulation hashing.
Ab initio computing a tabulation hash works the same as for the hashing algorithm (Spooky, Murmur, etc.) described in N3980. It is not that complicated: just serialize an object of any user-defined type through hash_append() and let the hash function update a pointer into a table of random numbers as you go along.
The trouble starts when trying to implement one of the nice properties of tabulation hashing: very cheap to compute incremental updates to the hash if an object is mutated. For "hand-made" tabulation hashes, one just recomputes the hash of the object's affected bytes.
My question is: how to communicate incremental updates to a uhash<MyTabulationAlgorithm>
function object while keeping true to the central theme of N3980 (Types don't know #)?
To illustrate the design difficulties: say I have a user-defined type X with N data members xi of various types Ti
struct X
{
T1 x1;
...
TN xN;
};
Now create an object and compute its hash
X x { ... }; // initialize
std::size_t h = uhash<MyTabulationAlgorithm>(x);
Update a single member, and recompute the hash
x.x2 = 42;
h ^= ...; // ?? I want to avoid calling uhash<>() again
I could compute the incremental update as something like
h ^= hash_update(x.x2, start, stop);
where [start, stop)
represents the range of the table of random numbers that correspond to the x2 data member. However, in order to incrementally (i.e. cheaply!) update the hash for arbitrary mutations, every data member needs to somehow know its own subrange in the serialized byte stream of its containing class. This doesn't feel like the spirit of N3980. E.g., adding new data members to the containing class, would change the class layout and therefore the offsets in the serialized byte stream.
Application: tabulation hashing is very old, and it has recently been shown that it has very nice mathematical properties (see the Wikipedia link). It's also very popular in the board game programming community (computer chess and go e.g.) where it goes under the name of Zobrist hashing. There, a board position plays the role of X, and a move the role of a small update (move a piece from its source to its destination e.g.). It would be nice if N3980 could not only be adapted to such tabulation hashing, but that it can also accomodate the cheap incremental updates.