Why does Data.HashTable use hashing with salt (from Data.Hashable)?

Question

I do not understand why Data.HashTable uses Data.Hashable, which has hashWithSalt as the (only/basic) method.

This does not fit with the natural optimization of computing the hash value once, and storing it in the object (natural, because Haskell objects are immutable).

If I want to use HashTables with that, then I'm forced to implement hashWithSalt. (Going from 1.2.0.* to 1.2.1.*, hashable re-introduced hash as a class method, but this does not help?)

The actual Table implementations don't seem to make use of hashWithSalt (HashTable.ST.Linear does not at all, HashTable.ST.Cuckoo uses two fixed salts only).

What package are you looking at? http://hackage.haskell.org/package/base-4.5.1.0/docs/Data-HashTable.html doesn't use `Hashable` at all. — dfeuer, May 24 '14 at 16:35
Can't you implement `hashWithSalt` in terms of `hash`? The cuckoo version might not work but the other hashtables will. — Daniel, May 24 '14 at 16:44
The reason hash tables use a hash with a salt is to mitigate hash collision DoS attacks when an attacker can control keys inserted into the table. Of course, they should use site-specific salts instead of salts fixed by the library. — Carl, May 24 '14 at 18:11
@dfeuer: `Data.HashTable` was in base-4.6, but became separate with base-4.7. I am looking at http://hackage.haskell.org/package/hashtables-1.1.2.1 — d8d0d65b3f7cf42, May 24 '14 at 18:43
@Carl: my point is that this (cryptographic) reasoning does not apply for use of hashing in data structures. Or does it? — d8d0d65b3f7cf42, May 24 '14 at 18:44
@d8d0d65b3f7cf42 It absolutely does. Any time you're running a service that interacts with the outside world, you need to be aware of potential attacks. A well-known (or so I thought) attack is a collision attack on hash tables used to store keys provided by attackers. It's a denial of service attack that overloads a process by forcing every key into the same hash bucket, resulting in a linear-time check on every insert for whether the key already exists. — Carl, May 24 '14 at 20:45
That attack results in O(n) requests from the attacker doing O(n^2) work in-process. It doesn't take that long for an attacker to force a service to grind to a halt. This attack works against web servers particularly well, and was demonstrated in practice several times. The key part is the attacker being able to predict hash collisions. If you salt the hash function with something site-specific, that's no longer a threat. — Carl, May 24 '14 at 20:48
@Daniel Velkov: yes, something like `hashWithSalt s x = hashWithSalt s $ hash x`. If there are enough bits in `hash x`, it should be fine. I think this is `defaultHashWithSalt` which is in the source ( https://hackage.haskell.org/package/hashable-1.2.2.0/docs/src/Data-Hashable-Class.html#Hashable ) but not exported. — d8d0d65b3f7cf42, May 25 '14 at 13:08
@Carl: I'm still not buying it. Where's the attacker in this application: the cache for a BDD base (cf. http://sourceforge.net/p/buddy/gitcode/ci/master/tree/src/cache.h) — d8d0d65b3f7cf42, May 25 '14 at 13:10
Whether you buy it or not is irrelevant to whether applications have been vulnerable to it in the past. https://isc.sans.edu/diary/Hash+collisions+vulnerability+in+web+servers/12286 — Carl, May 25 '14 at 14:35

score 3 · Answer 1 · answered Feb 23 '15 at 05:50

As Carl notes in the comments, the move to the hashWithSalt method over just hash (as the original Hashable used) was to allow people to mitigate DoS attacks based on hash collisions. For a period, a different random default salt was even generated on every run, using unsafePerformIO in the background. This lack of reproducibility turned out to be a huge problem, however, for people interested in e.g. persisting data structures across runs, getting reliable benchmarking numbers, etc.

So, the current approach is to provide the method, but tend to defer to a default salt that is fixed, and then add a warning to the documentation that this remains susceptible to various potential DoS attack vectors if used in public-facing ways. (You can see for yourself in the documentation here: http://hackage.haskell.org/package/hashable-1.2.1.0/docs/Data-Hashable.html)

Because hash is its own class method, it is easy enough to implement an object with a "saltless" hash that is memoized alongside it, and furthermore, you can implement hashWithSalt as just XORing the memoized hash with the salt if you like. Or, as the comments note, you can implement hashWithSalt more legitimately by hashing your generated/memoized hash together with the salt.
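A minimal sketch of that idea, assuming hashable 1.2.1 or later (where hash is a class method). The names WithHash, mkWithHash, cachedHash and payload are hypothetical, made up for this illustration and not part of hashable or hashtables:

```haskell
import Data.Hashable (Hashable (..))

-- Hypothetical wrapper: a value stored together with its precomputed,
-- saltless hash. The strict field forces the hash when the wrapper is built.
data WithHash a = WithHash
  { cachedHash :: !Int
  , payload    :: a
  }

-- Smart constructor: the only place where the payload itself is hashed.
mkWithHash :: Hashable a => a -> WithHash a
mkWithHash x = WithHash (hash x) x

-- Comparing the cached hashes first gives a cheap early exit.
instance Eq a => Eq (WithHash a) where
  WithHash h1 x == WithHash h2 y = h1 == h2 && x == y

-- The Eq constraint is needed by newer hashable releases, where Eq is a
-- superclass of Hashable; on older releases it is merely redundant.
instance Eq a => Hashable (WithHash a) where
  -- Saltless hash: just return the memoized value.
  hash = cachedHash
  -- Salted hash: mix the salt into the cached Int rather than rehashing
  -- the payload. The cheaper alternative mentioned above would be
  -- salt `xor` cachedHash x (with xor from Data.Bits).
  hashWithSalt salt = hashWithSalt salt . cachedHash
```

With this, a table that calls hashWithSalt with a couple of different salts only ever rehashes a single Int per key; the payload is traversed once, in mkWithHash.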
