To cut a long story short, I'm trying to generate Huffman codes from a canonical Huffman list of code lengths. Essentially, the following two nested loops should run and generate a binary string for each code. The code is:
    for (int i = 1; i <= 17; i++) {                   // i is the code length
        for (int j = 0; j < input.length; j++) {      // j is the symbol index
            if (input[j] == i) {
                result.put(allocateCode(i, j), j);    // update a hashmap
                huffCode += (1 << (17 - i));          // update the Huffman code
            }
        }
    }
Essentially, the code should look for all codes with a length of 1 and generate a Huffman code for each, then do the same for length 2, and so on. So, for example, codes of length 1 should go (in order): 0, 1. And codes of length 3 would go: 100, 101, 110.
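For comparison, here is a minimal, self-contained sketch of the same idea. It is not the code from my project: the class name `CanonicalCodes` and the helpers `assign` and `toBits` are made up for illustration, a length of 0 is assumed to mean "symbol unused", and it keeps a right-aligned counter that is shifted left once per length instead of adding `1 << (17 - i)` to a left-aligned value. It does not validate the lengths, so bad input will still produce bad codes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CanonicalCodes {

    // Assigns canonical codes: shorter lengths first, ties broken by symbol index.
    // Assumption: a length of 0 marks an unused symbol and is skipped.
    static Map<Integer, String> assign(int[] lengths, int maxLength) {
        Map<Integer, String> codes = new LinkedHashMap<>();
        int code = 0; // next free code, right-aligned at the current length
        for (int len = 1; len <= maxLength; len++) {
            for (int symbol = 0; symbol < lengths.length; symbol++) {
                if (lengths[symbol] == len) {
                    codes.put(symbol, toBits(code, len));
                    code++;
                }
            }
            code <<= 1; // moving on to the next length appends a zero bit
        }
        return codes;
    }

    // Formats the low `len` bits of `code` as a zero-padded binary string.
    static String toBits(int code, int len) {
        StringBuilder sb = new StringBuilder();
        for (int bit = len - 1; bit >= 0; bit--) {
            sb.append((code >> bit) & 1);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Lengths taken from the first (correct) run, indexed by symbol 0..7.
        int[] lengths = {4, 6, 1, 4, 4, 6, 2, 5};
        assign(lengths, 17).forEach((symbol, bits) ->
                System.out.println("symbol " + symbol + " -> " + bits));
    }
}
```

With the lengths from the first run, this gives 0 for symbol 2, 10 for symbol 6, 1100, 1101 and 1110 for symbols 0, 3 and 4, 11110 for symbol 7, and 111110 and 111111 for symbols 1 and 5.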
The allocateCode function simply returns a string that shows the result. The first run produces this:
Huffman code for code 2 is: 0 (0) length was: 1
Huffman code for code 6 is: 10 (2) length was: 2
Huffman code for code 0 is: 1100 (12) length was: 4
Huffman code for code 3 is: 1101 (13) length was: 4
Huffman code for code 4 is: 1110 (14) length was: 4
Huffman code for code 7 is: 11110 (30) length was: 5
Huffman code for code 1 is: 111110 (62) length was: 6
Huffman code for code 5 is: 111111 (63) length was: 6
This is correct, and the right Huffman codes have been generated. However, running it on a second array of lengths produces this:
Huffman code for code 1 is: 0 (0) length was: 1
Huffman code for code 4 is: 1 (1) length was: 1
Huffman code for code 8 is: 100 (4) length was: 3
Huffman code for code 9 is: 100 (4) length was: 3
Huffman code for code 13 is: 101 (5) length was: 3
Huffman code for code 16 is: 1011000 (88) length was: 7
Huffman code for code 10 is: 10110001 (177) length was: 8
Huffman code for code 2 is: 101100011 (355) length was: 9
Huffman code for code 3 is: 101100011 (355) length was: 9
Huffman code for code 0 is: 1011001000 (712) length was: 10
Huffman code for code 5 is: 1011001000 (712) length was: 10
Huffman code for code 6 is: 1011001001 (713) length was: 10
Huffman code for code 7 is: 10110010011 (1427) length was: 11
Huffman code for code 14 is: 10110010011 (1427) length was: 11
Huffman code for code 17 is: 10110010100 (1428) length was: 11
Huffman code for code 19 is: 10110010100 (1428) length was: 11
Huffman code for code 18 is: 101100101010000 (22864) length was: 15
As you can see, the same code is generated multiple times; examples are codes 8 & 9 and codes 2 & 3.
I think my problem lies within the nested loops, but I can't figure out why it works perfectly for one run and fails on another.
I might just be missing something obvious, but I can't see it for looking.
Any advice would be greatly appreciated.
Thanks
UPDATE
After going back through my code, it turns out I was actually making a small mistake when reading in the data in the first place, which is why I was getting incorrect Huffman codes!
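In hindsight, a sanity check on the lengths before generating any codes would have pointed at the input straight away. The sketch below is one way to do that using the Kraft inequality (the sum of 2^-length over all used symbols must not exceed 1). The class name `KraftCheck` and its methods are hypothetical, a length of 0 is assumed to mean "unused", and every length is assumed to be at most `maxLength`. The second array is reconstructed from the output of the failing run, with symbols 11, 12 and 15 (which never appeared) assumed to have length 0.

```java
public class KraftCheck {

    // Kraft sum scaled by 2^maxLength, so a complete code gives exactly 2^maxLength.
    // Assumption: 0 marks an unused symbol; all lengths are <= maxLength.
    static long kraftSum(int[] lengths, int maxLength) {
        long sum = 0;
        for (int len : lengths) {
            if (len > 0) {
                sum += 1L << (maxLength - len);
            }
        }
        return sum;
    }

    // True if the lengths claim more code space than exists, so codes must collide.
    static boolean isOverSubscribed(int[] lengths, int maxLength) {
        return kraftSum(lengths, maxLength) > (1L << maxLength);
    }

    public static void main(String[] args) {
        // Lengths from the first run, indexed by symbol 0..7.
        int[] first = {4, 6, 1, 4, 4, 6, 2, 5};
        // Lengths from the second run, indexed by symbol 0..19 (11, 12, 15 assumed 0).
        int[] second = {10, 1, 9, 9, 1, 10, 10, 11, 3, 3, 8, 0, 0, 3, 11, 0, 7, 11, 15, 11};

        System.out.println(isOverSubscribed(first, 17));  // prints false
        System.out.println(isOverSubscribed(second, 17)); // prints true
    }
}
```

The two length-1 entries in the second array already use up the whole code space on their own, so every longer code after them has nowhere to go. That matches the duplicated codes in the output above.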