I created a program that reads a file and encodes it character by character into a variable-length binary code. For example, the most common character would be 110, the next most common 0010, etc. I then wrote the entire binary code to a text file, so it looks something like 11010011001110000110001010110110... and so on. How would I go about decoding the binary code back into characters?
Viewed 496 times
-1
-
You are probably also going to need the Huffman tree. ;-) – NPE Aug 10 '14 at 19:02
-
Yes, you need to somehow transmit the decoding key along with the encoded data. Several ways to handle this, either as a separate structure at one end or the other of the data, or as inline data, where each new code is identified on first use. – Hot Licks Aug 10 '14 at 19:05
-
Once you can recover the character-code mapping (commonly done by using Canonical Huffman codes and storing just the code lengths), I would suggest table-based decoding. – harold Aug 10 '14 at 19:05
1 Answer
0
If you really generated a Huffman code, then it is a prefix code. That is, no valid code is a prefix of another valid code. Therefore you simply start at the beginning, and whatever code matches those bits is the first symbol. Remove those bits and repeat. In your example the stream starts with 110, the code for your most common character. Remove the 110, and look in your codes for 1, 10, 100, 1001, 10011, etc. until you find a match.
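
A minimal sketch of that loop in Python, assuming the encoded file holds the bits as the text characters '0' and '1' and that you still have the character-to-code table from the encoder. Only 110 and 0010 come from your question; the rest of `example_codes`, and the names `encode` and `decode`, are made up for illustration.

```python
def encode(text, codes):
    """Concatenate the code of every character (codes maps char -> bit string)."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Decode a string of '0'/'1' characters with a prefix-code table."""
    # Invert the table so we can look up a bit string and get its character.
    by_code = {code: ch for ch, code in codes.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        # Because no code is a prefix of another, the first match is the symbol.
        if current in by_code:
            out.append(by_code[current])
            current = ""
    if current:
        raise ValueError(f"trailing bits do not form a complete code: {current!r}")
    return "".join(out)


# Hypothetical prefix-free table; only '110' and '0010' appear in the question.
example_codes = {
    "e": "110",
    "t": "0010",
    "a": "10",
    "o": "111",
    "n": "000",
    "s": "0011",
    "h": "01",
}

print(decode(encode("notes", example_codes), example_codes))
```

Growing a string one bit at a time is the simplest form of this. Walking the Huffman tree or using a lookup table, as suggested in the comments, does the same job faster on large inputs.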

Mark Adler