Implementing hashtables in C++ (insertion and lazy deletion)

Question

I am implementing a Hashtable class in C++. The collision resolution method that I am using is linear probing with lazy deletion. I have seen implementations of this but had a question with regards to the insert method. Each cell of the hashtable has a state (active, deleted, empty). For some reason in the implementation I have seen when inserting a new element, they hash the key and then probe the table until an EMPTY cell is found (or until a cell already containing the same key is found).

Example Code:

int findPos(const string &key){
     int currentPos=hash(key);
     while(data[currentPos].state!=EMPTY && data[currentPos].key!=key){
         currentPos++;
         if (currentPos>=data.size())
            currentPos-=data.size()
         }
      return currentPos;
}

bool insert(const string &key){
     int currentPos=findPos(key);
     if (isActive(currentPos))
          return false; //already exists
     data[currentPos]=hashEntry(key,ACTIVE);
     if (++currentSize>data.size()/2)
          rehash();
     return true;   //element inserted
}

My question is: Is there a reason not to stop and insert into a cell that has been tagged as deleted? In other words, in the findPos method, why not change the while statement to while(data[currentPos].state==ACTIVE && data[currentPos].key!=key) so that the loop ends either when we find the cell with the key or the first deleted/empty cell. Then in the insert, test the state of the cell. If active the entry already exists, so return false; else insert the element.

score 3 · Accepted Answer · answered Sep 16 '12 at 16:47

3

The key could have been inserted further on, and later one of the intervening cells could have been marked as deleted. Then you would insert a duplicate instance of the same key.

answered Sep 16 '12 at 16:47

Thomas Padron-McCarthy

27,232
8
51
75

score 0 · Answer 2 · answered Sep 16 '12 at 18:04

Probably your reference code did not have lazy delete, or the DELETED state was added to an existing implementation. Yes you can safely "undelete" an entry for your key. But make sure this algorithm is used consistently, to avoid the situation described in @Thomas' answer.

Implementing hashtables in C++ (insertion and lazy deletion)

2 Answers2