How do I decode a message from a Huffman-encoded bit stream? I am not clear on the idea behind the Huffman algorithm.

As far as I understand it, suppose I am given the text message "My name is XYZ".

Then the encoding process goes this way:

1. Count the frequency of each character.
2. Sort the characters by frequency.
3. Construct a tree.
4. Traverse the tree, treating each left edge as 0 and each right edge as 1, to get the code for each character of the message.
5. Concatenate the codes to form the bit stream.
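The steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the names `build_tree` and `make_codes` and the tuple-based tree layout are assumptions made for the example, the message is assumed to be non-empty, and the bit stream is kept as a string of "0"/"1" characters for readability.

```python
import heapq
from collections import Counter
from itertools import count

def build_tree(text):
    """Build a Huffman tree. A leaf is a character; an internal node is a (left, right) tuple."""
    tiebreak = count()  # unique counter so the heap never has to compare tree nodes
    heap = [(freq, next(tiebreak), ch) for ch, freq in Counter(text).items()]
    heapq.heapify(heap)  # the heap takes the place of an explicit sort by frequency
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # least frequent subtree
        f2, _, right = heapq.heappop(heap)  # second least frequent subtree
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    return heap[0][2]

def make_codes(node, prefix=""):
    """Walk the tree: a left edge appends '0', a right edge appends '1'."""
    if isinstance(node, str):
        return {node: prefix or "0"}  # "0" covers a message with one distinct character
    left, right = node
    codes = make_codes(left, prefix + "0")
    codes.update(make_codes(right, prefix + "1"))
    return codes

message = "My name is XYZ"
tree = build_tree(message)
codes = make_codes(tree)
bits = "".join(codes[ch] for ch in message)
print(codes)
print(bits)
```

The exact codes depend on how ties between equal frequencies are broken, so two correct implementations can produce different bit streams for the same message.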

Now the problem is: how can I find the original message from the encoded bit stream?

I think we need to construct the Huffman tree again.

But how can I construct the Huffman tree from the bit stream?

user366312

2 Answers

The message can't be decoded without the original tree. So the sending party must include the tree in the message (for long messages the overhead will be small), or both parties must agree on a tree before sending messages. Decoding is then the reverse of encoding: read the stream bit by bit while walking the tree from the root. Once you hit a leaf, emit its character and start again from the root.
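A minimal sketch of that decoding loop, assuming the tree is already known to the receiver. The tree layout (a leaf is a character, an internal node is a `(left, right)` tuple), the function name `decode`, and the tiny hand-built tree are all assumptions made for this example; the tree is assumed to hold at least two symbols.

```python
def decode(bits, tree):
    """Decode a string of '0'/'1' characters by walking the tree from the root."""
    out = []
    node = tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]  # 0 = left edge, 1 = right edge
        if isinstance(node, str):  # reached a leaf
            out.append(node)       # emit its character
            node = tree            # restart at the root for the next code
    return "".join(out)

# Hypothetical tree giving the codes a = 0, b = 10, c = 11.
tree = ("a", ("b", "c"))
print(decode("010110", tree))  # abca
```

No separators are needed between codes because no code is a prefix of another, so hitting a leaf is unambiguous.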

Andrey
  • Wouldn't it be possible to transmit a dictionary? So when the message is encoded, you send a dictionary and then you can look them up? – Essej Jul 12 '17 at 18:43
  • @Baxtex it is possible of course. The OP implied (or that is how I understood it at least) that they don't have the tree. If the tree is sent first then it is not a problem. – Andrey Jul 12 '17 at 19:13

There are a couple of choices for this. One is to use a fixed Huffman tree. Just for example, if you're encoding normal English text, the relative frequencies of characters are usually close enough to constant that you can do pretty reasonably by having the sender and receiver agree in advance on the tree they'll both use.

For the two-pass Huffman algorithm you describe, you're pretty much stuck with transmitting the tree (in some form) along with the data.
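One possible form for transmitting the tree is a pre-order walk that writes a marker for each node. The sketch below is only an illustration of the idea: the marker scheme ("0" for an internal node, "1" followed by the character for a leaf), the function names, and the tuple-based tree layout are assumptions, and real formats usually pack this into bits rather than a text string.

```python
def serialize(node):
    """Pre-order walk: '0' marks an internal node, '1' + character marks a leaf."""
    if isinstance(node, str):
        return "1" + node
    left, right = node
    return "0" + serialize(left) + serialize(right)

def deserialize(data):
    """Rebuild the tree from the string produced by serialize()."""
    def parse(pos):
        if data[pos] == "1":                # leaf: the next character is the symbol
            return data[pos + 1], pos + 2
        left, pos = parse(pos + 1)          # internal node: parse both children
        right, pos = parse(pos)
        return (left, right), pos
    node, _ = parse(0)
    return node

# Hypothetical tree giving the codes a = 0, b = 10, c = 11.
tree = ("a", ("b", "c"))
header = serialize(tree)
print(header)                       # 01a01b1c
print(deserialize(header) == tree)  # True
```

The sender would put this header in front of the encoded bits, and the receiver would rebuild the tree before decoding.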

You can also use a dynamic Huffman tree, where you (for example) start with a tree as outlined in the first option above, but as they process the data, the transmitter and receiver both adjust the tree to fit the observed frequencies. The only trick to this is that each character must be encoded using the tree as it was before the adjustment; only then is the tree adjusted, and then the next character is processed. This way the receiving end can stay in sync with the sending end.
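A deliberately naive sketch of that ordering is shown below. It rebuilds the whole tree from the running counts after every character, which is much slower than a real adaptive Huffman algorithm that updates the tree incrementally, but it shows why both ends stay in step. The function names, the starting counts of 1 for every symbol, and the agreed `alphabet` parameter (assumed to contain at least two symbols) are all assumptions for this example.

```python
import heapq

def build_tree(counts):
    """Deterministic Huffman tree so sender and receiver always build the same one."""
    heap = [(f, i, ch) for i, (ch, f) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    n = len(heap)  # next unique tiebreak value
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, n, (a, b)))
        n += 1
    return heap[0][2]

def make_codes(node, prefix=""):
    if isinstance(node, str):
        return {node: prefix}
    codes = make_codes(node[0], prefix + "0")
    codes.update(make_codes(node[1], prefix + "1"))
    return codes

def adaptive_encode(message, alphabet):
    counts = {ch: 1 for ch in alphabet}  # agreed starting point
    bits = []
    for ch in message:
        bits.append(make_codes(build_tree(counts))[ch])  # encode with the current tree
        counts[ch] += 1                                  # only then adjust
    return "".join(bits)

def adaptive_decode(bits, alphabet):
    counts = {ch: 1 for ch in alphabet}  # same starting point as the sender
    out = []
    tree = build_tree(counts)
    node = tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):
            out.append(node)
            counts[node] += 1          # same adjustment, made after decoding
            tree = build_tree(counts)
            node = tree
    return "".join(out)

message = "abacabaaa"
bits = adaptive_encode(message, "abc")
print(adaptive_decode(bits, "abc") == message)  # True
```

No tree is transmitted here: the receiver reconstructs every intermediate tree from the characters it has already decoded.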

Jerry Coffin