2

I am trying to code a Huffman string encoding algorithm.

My solution works this way:

  1. since every letter in the string have a special binary code associated to it, search the binary tree and when a letter is found, add it to the map with binary code. (where I went wrong is here)
  2. iterate on the string, and for each letter, associate the value associated with the key of the letter of the map.

I don't have the tree printed somewhere, even tough it could help you help me, but here is what I get for the string abracadabra, and what I should get:

Correct Code : 000010000110110101111101011000111110110100111011101100101101110000110000110111100101111101010010

What I get : 00001000111011010110101111010101100011101011010

Here is my code:

#include <algorithm>
#include <map>

string codes = "";

void getMapCharBinaryCode(Node root, string &prefix, map <char, string> &m){
    if(!root) return;
    if(root->value){
        if(!m.count(root->value)){
            m[root->value] = prefix;
            prefix = "";
        } 
    }
    if(root->leftChild){
        getMapCharBinaryCode(root->leftChild, prefix += "0",m);
    }
    if(root->rightChild){
        getMapCharBinaryCode(root->rightChild, prefix += "1",m);
    }
   
}

string encode(string text, Node tree){
    // text is "abracadabra"
    // create map for each char -> binary code
    map<char, string> m;
    string prefix = "";
    getMapCharBinaryCode(tree, prefix, m);
    
    // iterate on text and assign each letter with binary code from map
    for(int i = 0; i < text.size(); i++) {
        codes += m[text[i]];
    }
    return codes;
}
Adrian Mole
  • 49,934
  • 160
  • 51
  • 83
  • 3
    First of all try to minimize the input needed to create the faulty output. Then use a debugger to step through the code statement by statement, while monitoring variables and their values, to see when and where things go wrong. – Some programmer dude Oct 30 '20 at 13:14
  • What is the input ```text```? – Abhinav Mathur Oct 30 '20 at 13:15
  • 1
    Please provide a [mre], including suitable input, to demonstrate the problem. – Yunnosch Oct 30 '20 at 13:15
  • `getMapCharBinaryCode(root->leftChild, prefix += "0",m);` -- I could look at this 10 times, and not give an answer of what that line will result in, especially what the second parameter will give the calling function. – PaulMcKenzie Oct 30 '20 at 13:21
  • `getMapCharBinaryCode(root->rightChild, prefix += "1",m);` -- I will take a guess and state that this line leads to unspecified behavior. Parameters in C++ can be processed in any order, not strictly left-to-right. Then you have that `+=` going on in the second parameter -- is that executed after or before the function is called? `¯\_(ツ)_/¯` – PaulMcKenzie Oct 30 '20 at 13:49
  • @PaulMcKenzie The C and C++ result of `+=` in an expression is well defined, and must be executed before the call. What is passed is the value after doing the `+=`. The only ambiguity would be if the first or third argument also depended on `prefix`, in which the ordering of execution of the arguments would affect that. But they do not depend on `prefix`. – Mark Adler Oct 30 '20 at 16:04
  • 4
    Hi Yolan, and welcome to Stack Overflow. I am glad the answer given by Mark worked for you (it's a good answer). However, it's not really our 'style' to add your corrected code to your question - this *could* be seen as somehow 'spoiling' the continuity. It is sufficient that you have marked the answer as "Accepted" - that way, others who view this Q/A will know that it worked as expected. – Adrian Mole Oct 31 '20 at 01:51
  • 2
    Though Yolan can, if they want, post an _answer_ to their own question with the corrected code. – Mark Adler Oct 31 '20 at 02:08

1 Answers1

4

You are destroying the code in prefix when you save a leaf with prefix = "", where the code is needed when dropping back down the tree and going to the next branch.

You can maintain a single storage area for prefix as you have, passing it by reference. However you need to manage the length of prefix as you go up and down the tree, and you need to not add a 0, and then add a 1 to that for the two branches, adding 01 for the right branch instead of 1.

As a starting point, you should just pass prefix by value, which makes copies, but does not require care in management. Get rid of the & and replace the prefix += with prefix +. Get rid of the prefix = "", which then does nothing.

Mark Adler
  • 101,978
  • 13
  • 118
  • 158
  • @YolanMaldonado please do *not* edit the question to include the answer – desertnaut Oct 31 '20 at 01:51
  • That's a very nice suggestion :) I don't think the OP got notified of your comment though. Would it be too much trouble to copy this comment into a comment on the OP's question? I think they'd benefit from it a lot, and they may end up putting up a really answer. Thanks :) – cigien Oct 31 '20 at 04:13
  • 1
    @yolanmaldonado Now that you got that working, you can run a single referenced `prefix` up and down, which is more efficient. You need to add one character for the left branch, then _replace_ that character (not add another) for the right branch, and then _delete_ that character before returning, to restore `prefix` to its state when the routine was entered. Do not mess with `prefix` when processing a leaf. – Mark Adler Oct 31 '20 at 04:43
  • @MarkAdler Thanks for editing the comment to include a ping for the OP, I appreciate it :) – cigien Nov 03 '20 at 15:36