
Sorry for the wordy title but it explains my question pretty well.

I am working on an assignment in Java where I need to create my own Hash Table.

The specifications are such that I must use an Array, as well as open-addressing for collision handling (with both double hashing and quadratic hashing implementations).
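For reference, the two probe sequences named above can be sketched roughly as follows. This is only an illustration, not the code from the class shown later: `ProbeSequenceSketch`, `quadratic`, and `doubleHash` are hypothetical names, and the step is assumed to be `q - (hash mod q)` so that it always falls in 1..q. A step of the form `(q - hash) % q` evaluates to 0 whenever the hash is a multiple of `q`, which would leave the probe stuck on the home index.

```java
import java.util.Arrays;

// Hypothetical helper names; only the probe formulas mirror the question.
class ProbeSequenceSketch {

    // Quadratic probing: home, home + 1, home + 4, home + 9, ... (mod capacity).
    static int[] quadratic(int home, int capacity, int probes) {
        int[] seq = new int[probes];
        for (int i = 0; i < probes; i++) {
            seq[i] = (int) ((home + (long) i * i) % capacity);
        }
        return seq;
    }

    // Double hashing: home + i * step (mod capacity), with q a prime below capacity.
    static int[] doubleHash(int home, long hash, int q, int capacity, int probes) {
        // Assumption: step = q - (hash mod q), which lies in 1..q and is never 0.
        int step = q - (int) Math.floorMod(hash, (long) q);
        int[] seq = new int[probes];
        for (int i = 0; i < probes; i++) {
            seq[i] = (int) ((home + (long) i * step) % capacity);
        }
        return seq;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(quadratic(2, 17, 4)));
        System.out.println(Arrays.toString(doubleHash(2, 26L, 13, 17, 4)));
    }
}
```

The multiplications are done in `long` so that `i * i` and `i * step` cannot overflow for large tables.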

My implementation works quite well: with over 200,000 randomly generated Strings I end up with only ~1400 collisions under both types of collision handling mentioned above (keeping my load factor at 0.6 and growing my Array by a factor of 2.1 when it goes over).
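The resize policy described above (load factor 0.6, growth factor 2.1) might look something like the sketch below. The names `ResizePolicySketch`, `needsResize`, `grownCapacity`, and `primeAtLeast` are hypothetical, and rounding the new capacity up to a prime is an assumption that is not stated in the question.

```java
// Hypothetical sketch of the load-factor check and growth step.
class ResizePolicySketch {

    // Trial-division primality test; fine for table-sized numbers.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Smallest prime greater than or equal to n.
    static int primeAtLeast(int n) {
        while (!isPrime(n)) n++;
        return n;
    }

    // True once size / capacity reaches the load factor (0.6 in the question).
    static boolean needsResize(int size, int capacity, double loadFactor) {
        return (double) size / capacity >= loadFactor;
    }

    // Grow by the rehash factor (2.1 in the question); prime rounding is an assumption.
    static int grownCapacity(int capacity, double rehashFactor) {
        return primeAtLeast((int) Math.ceil(capacity * rehashFactor));
    }

    public static void main(String[] args) {
        System.out.println(needsResize(11, 17, 0.6));
        System.out.println(grownCapacity(17, 2.1));
    }
}
```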

Here is where I'm stumped, however... My assignment calls for 2 specifications that I cannot figure out.

1) Have an option which, when removing a value from the table, instead of using "AVAILABLE" (replacing the index in the array with a junk value that indicates it is empty), I must find another value that previously hashed to this index and resulted in a collision. For example, if value A hashed to index 2 and value B also hashed to index 2 (and was later re-hashed to index 5 using my collision handling hash function), then removing value A will actually replace it with value B.

2) Keep track of the maximum number of collisions in a single array index. I currently keep track of all the collisions, but there's no way for me to keep track of the collisions at an individual cell.
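One possible way to approach both requirements without chaining is a parallel `int` array that counts, per home index, how many stored values were displaced from it. The sketch below is only an illustration under several assumptions: it uses quadratic probing, `String.hashCode()` stands in for the hashing functions shown later, and `BackfillTableSketch`, `displaced`, `home`, `probe`, and `find` are hypothetical names. On removal, it walks the removed value's probe sequence and moves back the first value whose home index is the vacated slot.

```java
// Hypothetical sketch; names and the stand-in hash are assumptions, not the question's code.
class BackfillTableSketch {
    static final String AVAILABLE = "AVAILABLE"; // tombstone marker, as in the question

    final String[] slots;
    final int[] displaced;  // displaced[h] = values whose home is h but that live elsewhere
    int maxCollisions = 0;  // high-water mark over all displaced[h]

    BackfillTableSketch(int capacity) {
        slots = new String[capacity];
        displaced = new int[capacity];
    }

    int home(String key) { return Math.floorMod(key.hashCode(), slots.length); }

    int probe(int home, int i) { return (int) ((home + (long) i * i) % slots.length); }

    boolean isFree(int idx) { return slots[idx] == null || slots[idx].equals(AVAILABLE); }

    // Inserts key and returns its slot, or -1 if the probe sequence found no free slot.
    int put(String key) {
        int h = home(key);
        for (int i = 0; i < slots.length; i++) {
            int idx = probe(h, i);
            if (isFree(idx)) {
                slots[idx] = key;
                if (i > 0) { // key could not take its home slot: record a collision at h
                    displaced[h]++;
                    maxCollisions = Math.max(maxCollisions, displaced[h]);
                }
                return idx;
            }
        }
        return -1;
    }

    // Returns the slot holding key, or -1; a never-used (null) slot ends the search.
    int find(String key) {
        int h = home(key);
        for (int i = 0; i < slots.length; i++) {
            int idx = probe(h, i);
            if (slots[idx] == null) return -1;
            if (slots[idx].equals(key)) return idx;
        }
        return -1;
    }

    // Removes key; if it sat in its home slot, pulls back one value displaced from that slot.
    boolean remove(String key) {
        int at = find(key);
        if (at < 0) return false;
        int h = home(key);
        slots[at] = AVAILABLE;
        if (at != h) { // the removed value was itself displaced from h
            displaced[h]--;
            return true;
        }
        if (displaced[h] == 0) return true; // nothing collided here, so no scan is needed
        for (int i = 1; i < slots.length; i++) {
            int idx = probe(h, i);
            if (slots[idx] == null) break;
            if (!slots[idx].equals(AVAILABLE) && home(slots[idx]) == h) {
                slots[h] = slots[idx];   // move the displaced value into its home slot
                slots[idx] = AVAILABLE;  // keep other probe chains through idx intact
                displaced[h]--;
                break;
            }
        }
        return true;
    }
}
```

The cost is one extra `int` per slot, and the backfill scan only runs when `displaced[h]` is nonzero, so removals from slots that never collided stay O(1). The slot a value is moved out of still needs a tombstone, since other values may probe through it; whether that satisfies the assignment's wording is a separate question.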

I was able to solve this problem using Separate Chaining by having each Array Index hold a linked list of all values that have hashed to this index, so that only the first one is retrieved when I call my get(value) method, but upon removal I can easily replace it with the next value that hashed to this index. It's also an easy way to get the max number of collisions per index.

But we were specifically told not to use separate chaining... I'm actually wondering if this is even possible without completely ruining the complexity of the hash table.

Any advice would be appreciated.

edit:

Here are some examples to give you an idea of my class structure:

public class daveHash {

//Attributes
public String[] dTable;
private double loadFactor, rehashFactor;
private int size = 0;
private String emptyMarkerScheme;
private String collisionHandlingType;
private int collisionsTotal = 0;
private int collisionsCurrent = 0;

//Constructors
public daveHash()
{
    dTable = new String[17];
    rehashFactor = 2.1;
    loadFactor = 0.6;
    emptyMarkerScheme = "A";
    collisionHandlingType = "D";
}

public daveHash(int size)
{
    dTable = new String[size];
    rehashFactor = 2.1;
    loadFactor = 0.6;
    emptyMarkerScheme = "A";
    collisionHandlingType = "D";
}

My hashing functions:

public long getHashCode(String s, int index)
{
    if (index > s.length() - 1)
        return 0;
    if (index == s.length()-1)
        return (long)s.charAt(index);

    if (s.length() >= 20)
        return ((long)s.charAt(index) + 37 * getHashCode(s, index+3)); 

    return ((long)s.charAt(index) + 37 * getHashCode(s, index+1)); 
}

public int compressHashCode(long hash, int arraySize)
{
    int b = nextPrime(arraySize);
    int index = ((int)((7*hash) % b) % arraySize);
    if (index < 0)
        return index*-1;
    else
        return index;
}

Collision handling:

 private int collisionDoubleHash(int index, long hashCode, String value, String[] table)
 {
     int newIndex = 0;
     int q = previousPrime(table.length);
     int secondFunction = (q - (int)hashCode) % q;
     if (secondFunction < 0)
         secondFunction = secondFunction*-1;
     for (int i = 0; i < table.length; i++)
     {
         newIndex = (index + i*secondFunction) % table.length;
         //System.out.println(newIndex);
         if (isAvailable(newIndex, table))
         {
             table[newIndex] = value;
             return newIndex;
         }
     }
    return -1; 
 }

 private int collisionQuadraticHash(int index, long hashCode, String value, String[] table)
 {
    int newIndex = 0;
    for (int i = 0; i < table.length; i ++)
    {
         newIndex = (index + i*i) % table.length;
         if (isAvailable(newIndex, table))
         {
            table[newIndex] = value;
            return newIndex;
         }
    }
    return -1; 
 }

 public int collisionHandling(int index, long hashCode, String value, String[] table)
 {
    collisionsTotal++;
    collisionsCurrent++;
    if (this.collisionHandlingType.equals("D"))
        return collisionDoubleHash(index, hashCode, value, table);
    else if (this.collisionHandlingType.equals("Q"))
        return collisionQuadraticHash(index, hashCode, value, table);
    else
        return -1;
 }

Get, Put and Remove:

private int getIndex(String k)
{
    long hashCode = getHashCode(k, 0);
    int index = compressHashCode(hashCode, dTable.length);
    if (dTable[index] != null && dTable[index].equals(k))
        return index;
    else
    {
        if (this.collisionHandlingType.equals("D"))
        {
            int newIndex = 0;
            int q = previousPrime(dTable.length);
            int secondFunction = (q - (int)hashCode) % q;
            if (secondFunction < 0)
                secondFunction = secondFunction*-1;
            for (int i = 0; i < dTable.length; i++)
            {
                newIndex = (index + i*secondFunction) % dTable.length;
                if (dTable[newIndex] != null && dTable[newIndex].equals(k))
                {
                    return newIndex;
                }
            }
        }
        else if (this.collisionHandlingType.equals("Q"))
        {
            int newIndex = 0;
            for (int i = 0; i < dTable.length; i ++)
            {
                newIndex = (index + i*i) % dTable.length;
                if (dTable[newIndex] != null && dTable[newIndex].equals(k))
                {
                    return newIndex;
                }
            }
        }
        return -1;
    }
}

public String get(String k)
{
    long hashCode = getHashCode(k, 0);
    int index = compressHashCode(hashCode, dTable.length);
    if (dTable[index] != null && dTable[index].equals(k))
        return dTable[index];
    else
    {
        if (this.collisionHandlingType.equals("D"))
        {
            int newIndex = 0;
            int q = previousPrime(dTable.length);
            int secondFunction = (q - (int)hashCode) % q;
            if (secondFunction < 0)
                secondFunction = secondFunction*-1;
            for (int i = 0; i < dTable.length; i++)
            {
                newIndex = (index + i*secondFunction) % dTable.length;
                if (dTable[newIndex] != null && dTable[newIndex].equals(k))
                {
                    return dTable[newIndex];
                }
            }
        }
        else if (this.collisionHandlingType.equals("Q"))
        {
            int newIndex = 0;
            for (int i = 0; i < dTable.length; i ++)
            {
                newIndex = (index + i*i) % dTable.length;
                if (dTable[newIndex] != null && dTable[newIndex].equals(k))
                {
                    return dTable[newIndex];
                }
            }
        }
        return null;
    }
}

public void put(String k, String v)
{
    double fullFactor = (double)this.size / (double)dTable.length;
    if (fullFactor >= loadFactor)
        resizeTable();

    long hashCode = getHashCode(k, 0);
    int index = compressHashCode(hashCode, dTable.length);

    if (isAvailable(index, dTable))
    {
        dTable[index] = v;
        size++;
    }
    else
    {
        collisionHandling(index, hashCode, v, dTable);
        size++;
    }
}

public String remove(String k)
{
    int index = getIndex(k);
    // getIndex returns -1 when the key is not found, so guard before indexing the array
    if (index < 0 || dTable[index] == null || dTable[index].equals("AVAILABLE") || dTable[index].charAt(0) == '-')
        return null;
    else
    {
        if (this.emptyMarkerScheme.equals("A"))
            {
            String val = dTable[index];
            dTable[index] = "AVAILABLE";
            size--;
            return val;
            }
        else if (this.emptyMarkerScheme.equals("N"))
            {
            String val = dTable[index];
            dTable[index] = "-" + val;
            size--;
            return val;
            }
    }
    return null;
}

Hopefully this can give you an idea of my approach. This does not include the Separate Chaining implementation I mentioned above. For this, I had the following inner classes:

private class hashList
{
    private class hashNode
    {
        private String element;
        private hashNode next;

        public hashNode(String element, hashNode n)
        {
            this.element = element;
            this.next = n;
        }
    }

    private hashNode head;
    private int length = 0;

    public hashList()
    {
        head = null;
    }

    public void addToStart(String s)
    {
        head = new hashNode(s, head);
        length++;
    }

    public int getLength()
    {
        return length;
    }   
}

And my methods were modified appropriately to access the element in the head node vs the element in the Array. I took this out, however, since we are not supposed to use Separate Chaining to solve the problem.

Thanks!!

waffledave
  • Please include examples of what you have tried so far – andrewdleach Nov 30 '15 at 21:32
  • Well, as I mentioned, I have solved the problem using Separate Chaining, but aside from pasting my 1000+ lines of code I'm not sure how to include examples. I'm not looking for code, this is more of a theory-based question. I'm wondering if it is possible to keep track of which values resulted in collisions with each individual index without having some sort of list-based structure in the Array. The only other solution I have come up with is using a multi-dimensional array, but this ruins my space complexity and it doesn't work without knowing the maximum # of collisions at any given index – waffledave Nov 30 '15 at 21:36
  • We need to see code to help though – andrewdleach Nov 30 '15 at 21:37
  • Just curious, have you looked into [Wikipedia](https://en.wikipedia.org/wiki/Hash_table) for a start? There is quite a bit of information and different methods there that could give you a nice start. – gonzo Nov 30 '15 at 21:56
  • Thank you for the Wiki link, it is very helpful, though the method described there (Coalesced Hashing) is basically the solution I came up with. I think the goal of the assignment is to only use open addressing but somehow keep track of every individual collision. The only other thing I can think of is to come up with some sort of inverse double-hashing function that can help me identify which values were bumped somewhere else after a collision... But I'm not good enough at math to figure that out. – waffledave Nov 30 '15 at 22:09
  • Question updated with code examples – waffledave Nov 30 '15 at 22:50
