
In the example of defining a custom hash function on page 114 of Nim in Action, the !$ operator is used to "finalize the computed hash".

import tables, hashes
type
  Dog = object
    name: string

proc hash(x: Dog): Hash = 
  result = x.name.hash
  result = !$result

var dogOwners = initTable[Dog, string]()
dogOwners[Dog(name: "Charlie")] = "John"

And in the paragraph below:

The !$ operator finalizes the computed hash, which is necessary when writing a custom hash procedure. The use of the !$ operator ensures that the computed hash is unique.

I am having trouble understanding this. What does it mean to "finalize" something? And what does it mean to ensure that something is unique in this context?

Imran

1 Answer


Your questions may be easier to answer if, instead of reading only the description of the !$ operator, you look at the beginning of the hashes module documentation. As you can see there, primitive data types each have a hash() proc that returns their hash. But if you have a complex object with several fields, you will want a single hash for the object as a whole. How do you produce one?

Without going into hash theory, and treating hashes as black boxes, you need two kinds of procs to produce a valid hash: the mixing (addition/concatenation) operator and the finalization operator. You use !& to keep mixing individual hashes into a temporary value, and then use !$ to turn that temporary value into the final hash. The Nim in Action example might have been easier to understand if the Dog object had more than one field, thus requiring both operators:

import tables, hashes, sequtils
type
  Dog = object
    name: string
    age: int

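proc hash(x: Dog): Hash =
  # Mix the hashes of both fields into a temporary value...
  result = x.name.hash !& x.age.hash
  # ...then finalize that value into the hash the table will use.
  result = !$result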

var dogOwners = initTable[Dog, string]()
dogOwners[Dog(name: "Charlie", age: 2)] = "John"
dogOwners[Dog(name: "Charlie", age: 5)] = "Martha"
echo toSeq(dogOwners.keys)
for key, value in dogOwners:
  echo "Key ", key.hash, " for ", key, " points at ", value
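
Written step by step, the same pattern starts from zero and mixes in one field at a time, which extends to any number of fields. The sketch below is only an illustration of that pattern, not the book's code. It also checks that a freshly built Dog with equal field values finds the existing table entry, which is what the table actually relies on (equal values must hash equally; the table falls back on == to tell keys apart).

```nim
import tables, hashes

type
  Dog = object
    name: string
    age: int

proc hash(x: Dog): Hash =
  var h: Hash = 0          # running, not yet finalized value
  h = h !& hash(x.name)    # mix in each field's hash in turn
  h = h !& hash(x.age)
  result = !$h             # finalize once, at the very end

var dogOwners = initTable[Dog, string]()
dogOwners[Dog(name: "Charlie", age: 2)] = "John"
dogOwners[Dog(name: "Charlie", age: 5)] = "Martha"

# Equal field values give the same hash, so a new Dog value finds the entry.
doAssert hash(Dog(name: "Charlie", age: 2)) == hash(Dog(name: "Charlie", age: 2))
doAssert dogOwners[Dog(name: "Charlie", age: 2)] == "John"
```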

As for why hash values are first mixed into a temporary value and then finalized, that depends largely on which hashing algorithm the Nim developers chose. You can see from the source code that mixing and finalization are mostly bit shifting. Unfortunately, the source code neither explains this nor points to a reference that would show why it is done this way or why this particular hashing algorithm was selected over others. You could try asking on the Nim forum, and maybe improve the documentation or source code with your findings.
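
To give a feel for what "mostly bit shifting" means, here is a minimal sketch of a mix step and a finalize step in the style of Bob Jenkins's one-at-a-time hash. It assumes the Nim operators follow that scheme; the names mixStep and finishStep are hypothetical and are not part of the hashes module, so check the actual source for your Nim version before relying on the details.

```nim
# Sketch of the mix and postprocess steps of Jenkins's one-at-a-time hash.
# mixStep and finishStep are hypothetical names, not the hashes module's API.

proc mixStep(h, val: uint): uint =
  ## Fold one more value into the running state (the role played by !&).
  result = h + val
  result = result + (result shl 10)
  result = result xor (result shr 6)

proc finishStep(h: uint): uint =
  ## Extra scrambling applied once at the end (the role played by !$).
  result = h + (h shl 3)
  result = result xor (result shr 11)
  result = result + (result shl 15)

# Mixing the same two values in a different order gives a different result.
let a = finishStep(mixStep(mixStep(0'u, 1'u), 2'u))
let b = finishStep(mixStep(mixStep(0'u, 2'u), 1'u))
doAssert a != b
```

Unsigned arithmetic in Nim wraps around on overflow, which is why the sketch uses uint rather than a signed type.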

Grzegorz Adam Hankiewicz
    I see - we are not providing a complete implementation of the hash, rather we are choosing some raw values and passing them on to Nim's internal procedure for "finalizing" the hash value. I do hope to contribute to Nim eventually, but it might be a bit early right now. – Imran Jan 10 '18 at 16:31
    The operators are the combine/mix and postprocess steps of [Bob Jenkins's one-at-a-time hash function](https://en.wikipedia.org/wiki/Jenkins_hash_function#one-at-a-time). – Reimer Behrends Jan 10 '18 at 19:45