What are some ways to prevent deliberate malicious attacks against hash function implementations?

Question

Say you have a server that uses hash functions, and an external party wants to exploit that: it keeps attacking the server with keys that it knows will (or with high probability will) result in collisions. How would you prevent this in practice?

I think one way is to choose the hash function randomly at the beginning of the problem, but this method seems slow in the sense that every time you change hash functions you have to rehash everything.

Please edit your question to include the specific problem you have with the selected hash function and the issue you see with that function. "Usually" you don't get hash collisions, so what is the problem or attack that is affecting your code? Clarify your question with this additional information. - Progman, Aug 24 '21 at 15:57
@Progman In the OP, I stated that it's for "some external source that wants to exploit that," and by "that," I mean "hash collisions." It's some malicious source that's deliberately trying to create these collisions. - user5965026, Aug 24 '21 at 16:57
Depending on your code, they may not succeed, because they might not be able to generate hash collisions at will. Please edit your question to include a detailed description of the problem or issue you have with the hash function you are using. - Progman, Aug 24 '21 at 17:49

score 0 · Answer 1 · answered Aug 25 '21 at 14:17

As you obviously realise, the best defence is to make sure attackers can't predict what your hash function will produce, and ideally can't learn your bucket count either. If the hash function is strong, hard to reverse, and produces a large range of outputs (say, 64-bit unsigned integers), then finding two keys with the same full hash may be time-consuming. But finding a key that lands in a specific bucket after taking the hash modulo N only needs about N attempts on average with arbitrary distinct keys, provided the attacker can compute the hash and knows N.
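To make the bucket-targeting point concrete, here is a minimal sketch of what an attacker could do when both the hash function and the bucket count are known. It assumes an unkeyed BLAKE2b from `hashlib` as a stand-in for "a strong, publicly known hash"; the names `public_hash` and `find_keys_for_bucket` are made up for this illustration.

```python
import hashlib
from itertools import count


def public_hash(key: bytes) -> int:
    """A strong but publicly known 64-bit hash (no secret involved)."""
    return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")


def find_keys_for_bucket(target: int, bucket_count: int, how_many: int):
    """Brute-force distinct keys that all land in bucket `target`.

    Returns the keys found and the number of candidate keys tried.
    Each candidate hits the target with probability about 1/bucket_count,
    so roughly bucket_count attempts are needed per colliding key.
    """
    found = []
    attempts = 0
    for i in count():
        key = f"key-{i}".encode()
        attempts += 1
        if public_hash(key) % bucket_count == target:
            found.append(key)
            if len(found) == how_many:
                return found, attempts


if __name__ == "__main__":
    keys, attempts = find_keys_for_bucket(target=3, bucket_count=64, how_many=5)
    print(f"found {len(keys)} keys for one bucket in {attempts} attempts")
```

The strength of the hash does not help here: the attacker never needs a full 64-bit collision, only keys that agree modulo the bucket count.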

"choose the hash function randomly at the beginning of the problem, but this method seems slow in the sense that every time you change hash functions you have to rehash everything."

There's not necessarily any need to change the hash function repeatedly; you just need to make it unguessable from exposed data, code, and observable behaviour. For example, you might generate a random seed on your server, write it to a file that only the server can read, and use it as the seed (key) for your hash function. Even if someone knows which hash function you use, without the seed they can't engineer collisions, as long as the function mixes the seed in thoroughly; keyed hashes such as SipHash are designed for exactly this. If your hash function doesn't support a seed, note that simply XORing its output with the random value is not enough: two keys whose hashes are equal still have equal hashes after the XOR, so the seed has to influence how the key itself is hashed.
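A minimal sketch of the seeded approach, assuming Python and using the `key` parameter of `hashlib.blake2b` as the seeded hash. That choice, and the names `SEED`, `keyed_hash`, and `bucket_index`, are assumptions for illustration rather than anything the question specifies; persisting the seed to a protected file is left out.

```python
import hashlib
import secrets

# Hypothetical per-server secret: generated once at startup and never
# exposed to clients. A real server might load it from a protected file.
SEED = secrets.token_bytes(16)


def keyed_hash(key: bytes, seed: bytes = SEED) -> int:
    """Return a 64-bit hash of `key` that depends on the secret seed."""
    digest = hashlib.blake2b(key, digest_size=8, key=seed).digest()
    return int.from_bytes(digest, "big")


def bucket_index(key: bytes, bucket_count: int, seed: bytes = SEED) -> int:
    """Map `key` to a bucket; unpredictable without knowing the seed."""
    return keyed_hash(key, seed) % bucket_count


if __name__ == "__main__":
    print(bucket_index(b"alice", 128))
```

Because the seed is fixed for the lifetime of the table, nothing has to be rehashed; the table only needs rebuilding if the seed is ever rotated.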

You could also count the collisions a particular client has caused and, if the pattern is obviously malicious, disconnect that client and remove its keys.
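One way this collision counting might look, as a rough sketch: a chained hash table that records which client inserted each key and bans a client once it exceeds a fixed collision budget. The class name `CollisionTrackingTable`, the fixed threshold, and the definition of a collision as "inserting into a non-empty bucket" are all assumptions; a real system would probably compare against the collision rate expected for its load factor.

```python
from collections import defaultdict


class CollisionTrackingTable:
    """Chained hash set that counts collisions per client (illustrative)."""

    def __init__(self, bucket_count=1024, max_collisions=100, hash_fn=hash):
        self.buckets = [[] for _ in range(bucket_count)]
        self.max_collisions = max_collisions
        self.hash_fn = hash_fn
        self.collisions = defaultdict(int)  # client_id -> collisions caused
        self.banned = set()

    def insert(self, client_id, key) -> bool:
        """Insert `key` for `client_id`; return False if the client is banned."""
        if client_id in self.banned:
            return False
        bucket = self.buckets[self.hash_fn(key) % len(self.buckets)]
        if any(k == key for _, k in bucket):
            return True  # key already stored; not a new collision
        if bucket:
            # Landing in an occupied bucket counts against this client.
            self.collisions[client_id] += 1
            if self.collisions[client_id] > self.max_collisions:
                self.evict(client_id)
                return False
        bucket.append((client_id, key))
        return True

    def evict(self, client_id) -> None:
        """Ban the client and remove every key it inserted."""
        self.banned.add(client_id)
        for i, bucket in enumerate(self.buckets):
            self.buckets[i] = [(c, k) for c, k in bucket if c != client_id]
```

The caller would disconnect the client when `insert` returns False. Honest clients also cause occasional collisions, so the budget has to be generous enough not to ban them.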
