I'm working on CS50's pset 5, speller. I need a hash function for a hash table that will efficiently store all of the words in the dictionary (~140,000). I found this one online, but I don't understand how it works. I don't know what << or ^ mean. Here is the hash function. Thank you! (I would really appreciate it if you could help me :))

int hash_it(char* needs_hashing)
{
    unsigned int hash = 0;
    for (int i=0, n=strlen(needs_hashing); i<n; i++)
        hash = (hash << 2) ^ needs_hashing[i];
    return hash % HASHTABLE_SIZE;
}
Nicolas F
  • `<<` and `^` are basic C operators that are covered in any C book or tutorial. It would be best to consult those resources for such language fundamentals, or to do some basic research with your favourite search engine. – kaylum Jun 27 '20 at 01:37
  • This is not a good hash function. For this application I would suggest CityHash or SipHash. – zwol Jun 27 '20 at 01:52

2 Answers

Those two are bitwise operators. They are easy to learn, and every programmer needs to know them.

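<< is the binary left shift operator. It moves every bit of its left operand to the left by the number of places given by its right operand, filling the vacated low bits with zeros.

Suppose the low bits of the variable "hash" are "0011" in binary.

hash << 2 becomes "1100".

^ is the XOR (exclusive OR) operator. A result bit is set only if it is set in exactly one of the two operands, not in both.

Suppose that in your code

hash << 2 gives "1100"

needs_hashing[i] gives "1111"

then

(hash << 2) ^ needs_hashing[i] gives "0011"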

For a quick introduction to bitwise operators, see https://www.tutorialspoint.com/cprogramming/c_bitwise_operators.htm
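
The two worked values above can be reproduced with a minimal C sketch. The helper names shift_left_two, xor_bits and show_example are hypothetical, introduced only for illustration; they are not part of the question's code.

```c
#include <stdio.h>

/* Hypothetical helper: shift every bit of x two places to the left.
   Zeros fill the two vacated low bits. */
unsigned int shift_left_two(unsigned int x)
{
    return x << 2;
}

/* Hypothetical helper: XOR sets a result bit only where exactly one
   of the two operands has that bit set. */
unsigned int xor_bits(unsigned int a, unsigned int b)
{
    return a ^ b;
}

/* Prints the two worked values from the text in decimal. */
void show_example(void)
{
    unsigned int shifted = shift_left_two(0x3);   /* binary 0011 -> 1100 */
    unsigned int mixed = xor_bits(shifted, 0xF);  /* 1100 ^ 1111 -> 0011 */
    printf("%u %u\n", shifted, mixed);
}
```

Calling show_example() prints 12 3, which is "1100" and "0011" in decimal.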

Pavan Chandaka

The hash function in the question distributes values poorly. After the final iteration, the two lowest bits of hash are exactly the two lowest bits of the last character of the input string needs_hashing, because hash << 2 always leaves those two bits zero before the XOR. As a result, if for example every string ends in a character with an even ASCII code, every hash is even too, and if HASHTABLE_SIZE is also even (a power of two, say), only the even-numbered buckets are ever used.
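
This low-bit behaviour can be checked with a short sketch. It repeats the question's loop without the final modulo, under the hypothetical name raw_hash, so that the raw bits can be inspected; low_bits_follow_last_char is likewise a name made up for this illustration.

```c
#include <string.h>

/* Hypothetical name: the question's loop, minus the final
   "% HASHTABLE_SIZE", so the raw hash bits can be inspected. */
unsigned int raw_hash(const char *s)
{
    unsigned int hash = 0;
    for (size_t i = 0, n = strlen(s); i < n; i++)
        hash = (hash << 2) ^ s[i];
    return hash;
}

/* Returns 1 if the two lowest bits of the hash equal the two lowest
   bits of the last character of s, 0 otherwise. s must be non-empty. */
int low_bits_follow_last_char(const char *s)
{
    unsigned int last = (unsigned int)s[strlen(s) - 1];
    return (raw_hash(s) & 3u) == (last & 3u);
}
```

For instance, raw_hash("ab") is 486: 'a' is 97, shifted left by two it becomes 388, and 388 ^ 98 is 486. The result is even because 'b' (98) is even, whatever the earlier characters were.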

A better-distributed hash, based on a cyclic shift (bit rotation):

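#include <stdint.h>

uint32_t hash_it(const char *p) {
  uint32_t h = 0xDeadBeef;                 // arbitrary non-zero seed
  char c;
  while ((c = *p++) != '\0')               // c is declared outside the loop so this compiles as C
    h = ((h << 5) | (h >> (32 - 5))) + c;  // rotate left by 5 bits, then add the character
  h ^= h >> 16;                            // fold the high half into the low half
  return h % HASHTABLE_SIZE;
}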
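
The expression (h << 5) | (h >> (32 - 5)) is the cyclic shift itself. Isolated as a sketch, under the hypothetical helper name rotl5, it looks like this:

```c
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value left by 5 bits. The five
   bits shifted out at the top re-enter at the bottom, so no bits of h
   are lost (a plain << would discard them). */
uint32_t rotl5(uint32_t h)
{
    return (h << 5) | (h >> (32 - 5));
}
```

For example, rotl5(1) is 32, and rotl5(0x80000000) is 0x10: the top bit wraps around to bit 4 instead of disappearing. This is why earlier characters keep influencing the hash, whereas the plain shift in the question eventually pushes their bits off the top.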
olegarch