I'm trying to better understand how a Huffman decoder works. I've got a code table, but I'm having a hard time understanding how the decoder would work because of ambiguity in the binary string.

(I'm learning this in prep for my final year at uni.)

My table:

Data  Hcode
0,     0
1,     1
2,     10
3,     11
17,    100
18,    101
19,    110
29,    111

If I have a Huffman code string like 010011, I can return many different combinations of data, so how can I discriminate?

I understand the Huffman logic in its binary tree representation: you follow a path to a given leaf, and that path is the code for the given value (0-255, ASCII). But I still don't know how to discriminate between returning data 0,1,0 and data 0,17.

Do I really have to enforce 2-bit codes on data 0 and 1 (00 and 01)?

I hope I've explained this as best I can XD

If you're wondering how I generated the table, you're gonna kill me, because I didn't use tree logic to generate it. Although I sorted the data (random bytes) by frequency, I generated the Hcodes by converting each element's position number into binary (hence why I called this post Poor Man's Huffman).

Many thanks for any advice.

Gelion

2 Answers

The code table is wrong. Huffman codes are supposed to be prefix-free: no code may be the beginning of another code. This is necessary in order to decode them afterwards without ambiguity.
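
To see the problem concretely, here is a small Python sketch that checks the table from the question for prefix conflicts and lists every way the string 010011 can be split into its codes. The function names `prefix_conflicts` and `all_decodings` are made up for this illustration.

```python
# Code table copied from the question (symbol -> bit string).
TABLE = {0: "0", 1: "1", 2: "10", 3: "11",
         17: "100", 18: "101", 19: "110", 29: "111"}


def prefix_conflicts(table):
    """Return (a, b) symbol pairs where the code of a is a prefix of the code of b."""
    items = sorted(table.items(), key=lambda kv: (len(kv[1]), kv[1]))
    return [(a, b)
            for i, (a, code_a) in enumerate(items)
            for b, code_b in items[i + 1:]
            if code_b.startswith(code_a)]


def all_decodings(bits, table):
    """Enumerate every way to split `bits` into codes taken from `table`."""
    if not bits:
        return [[]]  # exactly one way to decode nothing
    results = []
    for sym, code in table.items():
        if bits.startswith(code):
            for rest in all_decodings(bits[len(code):], table):
                results.append([sym] + rest)
    return results


conflicts = prefix_conflicts(TABLE)
print(len(conflicts))          # 10 conflicting pairs, e.g. (1, 2): "1" starts "10"

for parse in all_decodings("010011", TABLE):
    print(parse)               # 6 different parses, including [0, 17, 3]
```

With a prefix-free table the first function returns an empty list and the second never returns more than one parse.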

If you used a binary tree to create the codes, this would automatically ensure prefix-freeness. See: http://en.wikipedia.org/wiki/Huffman_coding
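
As a rough sketch of the tree approach in Python: repeatedly merge the two least frequent nodes, then read each code off the path from the root to a leaf. The frequencies below are invented for illustration (the question does not give any), and the sketch assumes the symbols themselves are not tuples, since tuples are used here for internal nodes.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Build a prefix-free table {symbol: bit string} from {symbol: frequency}."""
    tiebreak = count()  # unique counter so the heap never compares nodes directly
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single symbol still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two least frequent nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):         # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                               # leaf: a symbol
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


def decode(bits, codes):
    """Decode greedily; this is unambiguous only because the codes are prefix-free."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return out


# Hypothetical frequencies for the eight symbols in the question.
freqs = {0: 40, 1: 25, 2: 12, 3: 9, 17: 6, 18: 4, 19: 3, 29: 1}
codes = huffman_codes(freqs)
data = [0, 17, 3, 1, 29]
encoded = "".join(codes[s] for s in data)
print(decode(encoded, codes) == data)  # True
```

Because every symbol sits at a leaf, no code can be the start of another, so the decoder can emit a symbol as soon as the bits read so far match a code.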

And now, I am going to kill you ... ;)

user492238

Not only is the code table wrong, the lengths of the codes are also wrong. If you have two one-bit codes, you have already used up all of the code space, and can have no other codes. What you have shown is not only not a Huffman code and not a prefix code -- it is in fact not a code at all.
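
One way to make the "code space" argument concrete is the Kraft inequality: a binary prefix code with code lengths l_1, ..., l_n can exist only if the sum of 2^(-l_i) is at most 1. The Python sketch below (the name `kraft_sum` is just for illustration) computes that sum for the lengths in the question's table.

```python
from fractions import Fraction


def kraft_sum(lengths):
    """Sum of 2**-length over all code lengths, computed exactly."""
    return sum(Fraction(1, 2 ** n) for n in lengths)


# Code lengths from the question: two 1-bit, two 2-bit, four 3-bit codes.
question_lengths = [1, 1, 2, 2, 3, 3, 3, 3]

print(kraft_sum([1, 1]))            # 1: two 1-bit codes already fill the space
print(kraft_sum(question_lengths))  # 2: twice the available code space
print(kraft_sum([3] * 8))           # 1: eight fixed 3-bit codes fit exactly
```

A sum above 1 means no uniquely decodable code with those lengths exists, whatever bit patterns are chosen.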

Mark Adler