0

I'm new to Hash Maps and I have an assignment due tomorrow. I implemented everything and it all worked out fine, except for when I get a collision. I cant quite understand the idea of linear probing, I did try to implement it based on what I understood, but the program stopped working for table size < 157, for some reason.

void hashEntry(string key, string value, entry HashTable[], int p) 
    {
        key_de = key;
        val_en = value;
       for (int i = 0; i < sizeof(HashTable); i++)
        {
        HashTable[Hash(key, p) + i].key_de = value;
        }
    }

I thought that by adding a number each time to the hash function, 2 buckets would never get the same Hash index. But that didn't work.

Omar Baz
  • 15
  • 1
  • 4
  • 2
    I will guarantee `sizeof(HashTable)` is wrong, so start with that. `i < sizeof(HashTable)` is the size of a *pointer*. And is `p` the current table max ? – WhozCraig Dec 15 '19 at 12:42
  • The basic principle of hashing is to map a larger set to a smaller set. This means that there is always the possibility that multiple values of the larger set are mapped to the same value of smaller set -> a collision. The quality of hash function is merely reducing the probability of this for the expected average real data but it cannot prevent this. Hence, your hash table has to be prepared for this. A simple solution is to store some kind of container for the hash table entries, e.g. a list. (In case of collision, you have to test every value of this list.) Another solution is rehashing. – Scheff's Cat Dec 15 '19 at 12:42
  • @WhozCraig yes `p` is the table max. – Omar Baz Dec 15 '19 at 12:44
  • FYI: Wikipedia about [Collision Resolution](https://en.wikipedia.org/wiki/Hash_table#Collision_resolution). – Scheff's Cat Dec 15 '19 at 12:44
  • _I thought that by adding a number each time to the hash function, 2 buckets would never get the same Hash index._ IMHO, 2 buckets shouldn't never had the same hash index (except that this resulted from the specific collision resolution which is used). – Scheff's Cat Dec 15 '19 at 12:52

1 Answers1

1

A hash table with linear probing requires you

  1. Initiate a linear search starting at the hashed-to location for an empty slot in which to store your key+value.
  2. If the slot encountered is empty, store your key+value; you're done.
  3. Otherwise, if they keys match, replace the value; you're done.
  4. Otherwise, move to the next slot, hunting for any empty or key-matching slot, at which point (2) or (3) transpires.
  5. To prevent overrun, the loop doing all of this wraps modulo the table size.
  6. If you run all the way back to the original hashed-to location and still have no empty slot or matching-key overwrite, your table is completely populated (100% load) and you cannot insert more key+value pairs.

That's it. In practice it looks something like this:

bool hashEntry(string key, string value, entry HashTable[], int p)
{
    bool inserted = false;
    int hval = Hash(key, p);

    for (int i = 0; !inserted && i < p; i++)
    {
        if (HashTable[(hval + i) % p].key_de.empty())
        {
            HashTable[(hval + i) % p].key_de = key;
        }

        if (HashTable[(hval + i) % p].key_de == key)
        {
            HashTable[(hval + i) % p].val_en = value;
            inserted = true;
        }
    }

    return inserted;
}

Note that expanding the table in a linear-probing hash algorithm is tedious. I suspect that will be forthcoming in your studies. Eventually you need to track how many slots are taken so when the table exceeds a specified load factor (say, 80%), you expand the table, rehashing all entries on the new p size, which will change where they all end up residing.

Anyway, hope it makes sense.

WhozCraig
  • 65,258
  • 11
  • 75
  • 141