I'm working on CS50's speller.
This is my hash function. It works well, but I don't think it's very good because it has a lot of if conditions.
I tried to give every word that has an apostrophe as its second character (e.g. A', B', C') its own bucket.
So I used (26 x 26 x 26) + 26 = 17602 buckets:
#include <ctype.h>
#include <string.h>

unsigned int hash(const char *word)
{
    /*
    Hashing algorithm that returns a value based on the first three letters of the word.
    Every letter is lowercased so it can easily be converted to an alphabetical index (a = 0 ... z = 25).
    The first slot of each first-letter block is reserved for words whose second character is an apostrophe.
    [A'=0] [Aaa=1] [Aab=2] ... [Aay=25] [Aaz=26] ... [B'=677] [Baa=678] --- 676 letter pairs + 1 apostrophe slot per first letter
    n = (677 * first(word) + 26 * second(word) + third(word)) + 1;
    */
    int hash_value = 0;
    if (strlen(word) <= 1) // only one letter: use that letter, store in the slot after the apostrophe slot
    {
        hash_value = ((677 * (tolower(word[0]) - 97)) + 1);
        return hash_value;
    }
    else if (!isalpha(word[1])) // second character is an apostrophe: use only the first letter
    {
        hash_value = (677 * (tolower(word[0]) - 97));
        return hash_value;
    }
    else if (strlen(word) <= 2) // only two letters: use both, store after the apostrophe slot
    {
        // multiplier is 26 (not 27) so the largest index stays below 17602
        hash_value = ((677 * (tolower(word[0]) - 97)) + (26 * (tolower(word[1]) - 97))) + 1;
        return hash_value;
    }
    else if (!isalpha(word[2])) // third character is an apostrophe: use only the first two letters
    {
        hash_value = ((677 * (tolower(word[0]) - 97)) + (26 * (tolower(word[1]) - 97))) + 1;
        return hash_value;
    }
    else
    {
        hash_value = ((677 * (tolower(word[0]) - 97)) + (26 * (tolower(word[1]) - 97)) + (tolower(word[2]) - 97)) + 1;
        return hash_value;
    }
}
It actually works quite well:
./speller texts/lalaland.txt
WORDS MISSPELLED: 955
WORDS IN DICTIONARY: 143091
WORDS IN TEXT: 17756
TIME IN load: 0.02
TIME IN check: 0.02
TIME IN size: 0.00
TIME IN unload: 0.00
TIME IN TOTAL: 0.05
I just don't like the way I used so many else-if conditions.
So maybe you could help me with better code (still using the first three letters).
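To make it clearer what I'm after, here is a rough sketch of a shorter version that should produce the same bucket layout as the comment in my function (A'=0, Aaa=1, B'=677, Baa=678). It is only a sketch, not tested against the full dictionary. The helper letter_index and the constants LETTERS, PER_FIRST and N_BUCKETS are names I made up for illustration; they are not part of the CS50 distribution code. It assumes every word starts with a letter and falls back to bucket 0 otherwise.

```c
#include <ctype.h>

// Hypothetical constants for illustration (not from the CS50 distribution code)
#define LETTERS 26
#define PER_FIRST (LETTERS * LETTERS + 1) // 676 letter pairs + 1 apostrophe slot = 677
#define N_BUCKETS (LETTERS * PER_FIRST)   // 17602 buckets in total

// Hypothetical helper: maps a letter to 0..25, anything else ('\'' or '\0') to -1
static int letter_index(char c)
{
    unsigned char u = (unsigned char) c;
    return isalpha(u) ? tolower(u) - 'a' : -1;
}

unsigned int hash(const char *word)
{
    int first = letter_index(word[0]);
    if (first < 0)
    {
        // Assumption: words always start with a letter; otherwise use bucket 0
        return 0;
    }

    int second = letter_index(word[1]);
    if (second < 0)
    {
        // One-letter word goes to slot 1; apostrophe as second character goes to slot 0
        return first * PER_FIRST + (word[1] == '\0' ? 1 : 0);
    }

    // A missing or non-letter third character counts as 0, like in my original code
    int third = letter_index(word[2]);
    if (third < 0)
    {
        third = 0;
    }
    return first * PER_FIRST + second * LETTERS + third + 1;
}
```

With this sketch, hash("a'") should give 0, hash("aaa") 1, hash("b'") 677 and hash("baa") 678, and the largest value, hash("zzz"), is 17601, so everything stays inside the 17602 buckets. It avoids calling strlen and reads each character only after the previous one is known to be a letter, but it still has a few branches, so I'm not sure it is really the cleanest way.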