I need to read a file and store each word in a hash table that handles collisions with linked lists, counting how many times each word appears (the node's value). With a small text (about 30 lines) my code works, but from about 100 lines on it crashes (Segmentation fault: 11). I know my hashCode function is bad, but it should not crash. I think the problem is in how I increment the value.

using namespace std;

class HashNode
{
public:
    string key;
    string value;

public:
    HashNode(string key, int value)
    {
        this->key = key;
        this->value = value;
    }
    friend class HashTable;
};

class HashTable {
    private:
        list<HashNode> *buckets; 
        int size;               
        int capacity;           
        int collisions;

    public:
        HashTable(int capacity){
            buckets = new list<HashNode>[capacity];
            this->capacity = capacity;
            this->size = 0;
            this->collisions = 0;
        }
        ~HashTable()
        {
            for (int i = 0; i < capacity; i++)
                buckets[i].clear();

            delete[] this->buckets;
        }
        int hashCode(string key)
        {
            int sum = 0;
            for (int k = 0; k < key.length(); k++)
                sum = sum + int(key[k]);
            return sum % capacity;
        }
        void insert(string key)
        {
            int value=0;
            int index = hashCode(key) % this->capacity; 

            for (list<HashNode>::iterator it = buckets[index].begin(); it != buckets[index].end(); ++it)
                if (it->key == key)
                {
                    it->value+=1; 
                    return;
                }
            if (buckets[index].size() > 0)
                collisions++;

            buckets[index].push_back(HashNode(key, value)); 
            this->size++;                                   
        }

        int getCollisions()
        {
            return this->collisions;
        }

};

int main() {
    string user_input;
    string word;
    ifstream inFile;
    string parameter;
    string command;
    HashTable object(80000);
    inFile.open("file.txt");
    cout << "Welcome " << endl;
    if (!inFile)
    {
        cout << "Unable to open the file";
        exit(1); 
    }
    listOfCommand();
    while (inFile >> word)
    {   
        object.insert(word);
    }
}
    

What can cause this crash? Any help will be appreciated!

Marek R
  • Did you try to debug your code? It will show where the crash happens. Please post a [mcve] if you need help. The problem I see is using a raw pointer to store an array, why not use `vector`? – Quimby Dec 04 '20 at 21:05
  • where do you allocate the list that `list<HashNode> *buckets;` is supposed to point to? – 463035818_is_not_an_ai Dec 04 '20 at 21:06
  • If you want a list of `HashNode`s, the member should be `list<HashNode>` (no pointer). If you want a dynamic array of lists of `HashNode`, it should be a `std::vector<std::list<HashNode>>`. A `list<HashNode>*` rarely makes sense – 463035818_is_not_an_ai Dec 04 '20 at 21:08
  • the code works with around 30 lines, my question is that why it is not working with larger number of words in a text – mike jetski Dec 04 '20 at 21:12
  • please post a [mcve]. We cannot know what is wrong with code you do not show. The code you posted is missing essential details; if they are also missing from your real code, that would explain a lot, but we cannot know what your real code looks like – 463035818_is_not_an_ai Dec 04 '20 at 21:17
  • also, if your code has undefined behavior, then "it works for 30 lines" does not help. Appearing to work under some conditions and blowing up when the moon is in a different phase is typical for UB – 463035818_is_not_an_ai Dec 04 '20 at 21:18
  • You are focused on this "30 lines" criterion. Have you rigorously confirmed this hypothesis? Based on how others debug their programs, it is reasonably likely that you might have jumped to a conclusion. – JaMiT Dec 04 '20 at 21:18
  • Thanks, I posted all my code – mike jetski Dec 04 '20 at 21:19
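
The comments above suggest replacing the raw `list<HashNode>*` with a `std::vector<std::list<HashNode>>`. A minimal sketch of what that could look like follows. The names `WordCount`, `WordTable`, `indexOf` and `count` are hypothetical and not part of the question's code. The sketch also assumes a capacity greater than zero, stores the count as an `int`, and starts it at 1 on first insertion, which differs from the posted code.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical reduced node: the count is an int, so ++ really increments it.
struct WordCount {
    std::string key;
    int count;
};

class WordTable {
    // The vector owns its storage, so no new[]/delete[] or destructor is needed.
    std::vector<std::list<WordCount>> buckets;

    // Sum the bytes as unsigned char so each term is in 0..255.
    std::size_t indexOf(const std::string& key) const {
        std::size_t sum = 0;
        for (unsigned char ch : key)
            sum += ch;
        return sum % buckets.size(); // assumes buckets.size() > 0
    }

public:
    explicit WordTable(std::size_t capacity) : buckets(capacity) {}

    void insert(const std::string& key) {
        // at() throws std::out_of_range instead of silently reading out of bounds.
        std::list<WordCount>& bucket = buckets.at(indexOf(key));
        for (WordCount& node : bucket) {
            if (node.key == key) {
                ++node.count;
                return;
            }
        }
        bucket.push_back(WordCount{key, 1}); // assumption: first occurrence counts as 1
    }

    // Returns 0 for a word that was never inserted.
    int count(const std::string& key) const {
        for (const WordCount& node : buckets.at(indexOf(key)))
            if (node.key == key)
                return node.count;
        return 0;
    }
};
```

With this layout, a bad index surfaces as an exception from `at()` rather than a segmentation fault, which makes this kind of bug easier to locate.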

1 Answer

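Most likely `char` is signed on your system, so any byte above 127 (for example, part of a non-ASCII character) becomes a negative number when converted in the line `sum = sum + int(key[k]);`. If `sum` ends up negative, `sum % capacity` is negative too, and you get a segmentation fault when you access `buckets[index]` with a negative index.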

A quick fix is to convert `key[k]` to `unsigned char` first, and only then to `int`:

for (int k = 0; k < key.length(); k++) {
    unsigned char c = static_cast<unsigned char>(key[k]);
    sum = sum + static_cast<int>(c);
}
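
To see the effect in isolation, here is a small sketch that compares the two loops as free functions. `hashSigned` and `hashUnsigned` are hypothetical names, and passing `capacity` as a parameter is an assumption made only to keep the example self-contained.

```cpp
#include <string>

// Mirrors the question's hashCode: plain char may be signed on this platform.
int hashSigned(const std::string& key, int capacity) {
    int sum = 0;
    for (char ch : key)
        sum += int(ch);      // negative for bytes >= 0x80 when char is signed
    return sum % capacity;   // % keeps the sign of the left operand, so this can be negative
}

// Same loop with the cast through unsigned char applied.
int hashUnsigned(const std::string& key, int capacity) {
    int sum = 0;
    for (char ch : key)
        sum += static_cast<int>(static_cast<unsigned char>(ch)); // always 0..255
    return sum % capacity;   // non-negative as long as sum does not overflow
}
```

For the two bytes `"\xC3\xA9"` (the UTF-8 encoding of "é") and a capacity of 80000, `hashUnsigned` returns 364, while on a platform where `char` is signed `hashSigned` returns -148, which is exactly the kind of index that reads outside `buckets`.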
fdermishin
  • Seems a bit roundabout. It's the index that is not allowed to be negative, so wouldn't it make more sense to change `sum` (and `index`, `size`, and `capacity`) from `int` to `unsigned` and not need an additional cast? – JaMiT Dec 05 '20 at 00:49
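
The alternative from this comment could be sketched as follows. `hashCodeUnsigned` is a hypothetical free-function name, and `capacity` is passed as a parameter only for illustration. A negative `char` wraps around when it is added to an `unsigned` sum, which is well defined, so the result of `%` always lies in the range 0 to `capacity - 1`. The hash values for non-ASCII bytes differ from those of the `unsigned char` cast, but the index stays valid either way.

```cpp
#include <string>

// Hypothetical variant of hashCode that does all arithmetic in unsigned.
unsigned hashCodeUnsigned(const std::string& key, unsigned capacity) {
    unsigned sum = 0;
    for (char ch : key)
        sum += ch;           // a negative char is converted to unsigned and wraps; no cast needed
    return sum % capacity;   // always in [0, capacity), assuming capacity > 0
}
```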