-1

I am trying to implement Huffman Coding and can't figure out how to encode a character using the trie without generating a lookup table. What I don't want to do is generate a map of each character to its encoded string of bits. I am trying to write a method which takes in the root of the Huffman trie and a character and spits out the correct code.

I have written the following code which doesn't work correctly. The problem is, I don't know how to get it to stop after getting the right result. It continues to add to the code after reaching the correct leaf node.

string lookup(HuffNode* root, char c, string prevCode, string direction){
  string currentCode = prevCode + direction;
  if(!root->isLeaf()){
    currentCode = lookup(root->getLeft(), c, currentCode, "0");
    currentCode = lookup(root->getRight(), c, currentCode, "1");
  }else{
    if(root->getChar() == c){
      return currentCode;
    }else{
      return prevCode;
    }
  }

  return currentCode;
}

string encodeChar(HuffNode* trie, char c){
  return lookup(trie, c, "", "");
} 

What would be the correct way to get the encoding of a character from a trie?

  • How does one determine when you have reached the correct leaf, and why isn't your program doing that? – Scott Hunter Feb 09 '19 at 02:56
  • @ScottHunter the correct leaf would be the one whose data member is the same as the character which was passed as a parameter to the `encodeChar` method in the first place. The issue is not so much that the program doesn't know when it has reached the correct leaf as it is that there are still recursive calls on the stack after the program has identified that it has found the correct node. –  Feb 09 '19 at 03:03

1 Answers1

2

It would be terribly inefficient to have to search a good chunk of the tree every single time you want to emit a code. You should in fact make a table of codes for each symbol.

In any case, what you need to do is build up the code in a single string that is passed by reference. Do not return the string. (It will be there when it's all done in the string that was originally passed by reference.) Instead return true or false for if the symbol was found in a leaf. Then when you call lookup() on each branch, see if it returns true. If it does, then immediately return true. For each call, add the appropriate bit to the string (0 or 1). If it returns false for the second branch, remove the bit you added, and return false.

Then the original call will return true if it finds the symbol, and the string will have the code. If it returns false, the symbol wasn't in any of the leaves, and the string will be empty.

Mark Adler
  • 101,978
  • 13
  • 118
  • 158
  • Oh, of course. Reading this, I'm kicking myself for not coming up with that. Thanks for the second set of eyes. –  Feb 09 '19 at 03:24