Huffman code tables

Question

I don't understand what the Huffman tables in a JPEG contain. Could someone explain this to me? Thanks.

http://en.wikipedia.org/wiki/JPEG explains the format in detail. — macarthy, Feb 10 '11 at 06:37

paxdiablo · Accepted Answer · 2016-06-01T04:24:17.177

Huffman encoding is a variable-length data compression method. It works by assigning the most frequent values in an input stream to the encodings with the smallest bit lengths.

For example, the input "Seems every eel eeks elegantly." may encode the letter e as binary 1 and all other letters as various longer codes, all starting with 0. That way, the resultant bit stream would be smaller than if every letter had a fixed size. To see how, let's examine the quantities of each character and construct a tree that puts the common ones at the top.

Letter(s)            Count (each)
-------------------  ------------
e                              10
<SPC>                           4
l                               3
s y                             2
S m v r k g a n t .             1
<EOF>                           1

The end of file marker EOF is there since you generally have to have a multiple of eight bits in your file. It's to stop any padding at the end from being treated as a real character.
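If you want to check those counts yourself, a minimal sketch along these lines should do it. The "<EOF>" key is just an illustrative name for the pseudo-symbol, not a real character.

```python
from collections import Counter

text = "Seems every eel eeks elegantly."

# Count every character, then add the EOF pseudo-symbol once.
# "<EOF>" is an illustrative name; it never occurs in the text itself.
counts = Counter(text)
counts["<EOF>"] = 1

for symbol, n in counts.most_common():
    label = "<SPC>" if symbol == " " else symbol
    print(f"{label:<6} {n:>3}")
```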

                                 __________#__________
                ________________/______________       \
       ________/________                   ____\____   e
    __/__             __\__             __/__       \
   /     \           /     \           /     \     / \
  /       \         /       \         /      SPC  l   s
 / \     / \       / \     / \       / \
y   S   m   v     /   k   g   \     n   t
                 /\          / \
                r  .        a  EOF

Now this isn't necessarily the most efficient tree but it's enough to establish how the encodings are done. Let's first look at the uncompressed data. Assuming an eight-bit encoding, those thirty-one characters (we don't need the EOF for the uncompressed data) are going to take up 248 bits.
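On the efficiency point, the usual Huffman construction repeatedly merges the two least frequent entries until one tree remains. The sketch below is my own illustration of that procedure (the function name and the "<EOF>" key are assumptions) and it only tracks code lengths, not the actual bit patterns. For this input it comes out a few bits shorter than the hand-drawn tree.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code_lengths(freqs):
    """Return {symbol: code length} using the standard Huffman merge procedure."""
    tiebreak = count()  # unique sequence number so equal weights never compare symbols
    heap = [(weight, next(tiebreak), [sym]) for sym, weight in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), syms1 + syms2))
    return lengths

freqs = Counter("Seems every eel eeks elegantly.")
freqs["<EOF>"] = 1  # illustrative end-of-file pseudo-symbol

lengths = huffman_code_lengths(freqs)
total_bits = sum(freqs[sym] * lengths[sym] for sym in freqs)
print(total_bits)  # 111
```

The exact shape of the tree depends on how ties are broken, but the total bit count is the same for any tree built this way.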

But, if you use the tree above to locate the characters, outputting a zero bit if you take the left sub-tree and a one bit if you take the right, you get the following:

Section      Encoding
----------   --------
Seems<SPC>   00001 1 1 00010 0111 0101                    (20 bits)
every<SPC>   1 00011 1 001000 00000 0101                  (22 bits)
eel<SPC>     1 1 0110 0101                                (10 bits)
eeks<SPC>    1 1 00101 0111 0101                          (15 bits)
elegantly    1 0110 1 00110 001110 01000 01001 0110 00000 (36 bits)
.<EOF>       001001 001111                                (12 bits)

That gives a grand total of 115 bits, rounded up to 120 since the output needs to be a whole number of bytes, but that's still about half the size of the uncompressed data.
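The arithmetic can be reproduced with a small sketch. The code table is read off the example tree; the names CODES, EOF and encode are just for illustration, and padding with zero bits is an assumption on my part.

```python
EOF = "<EOF>"  # illustrative name for the end-of-file pseudo-symbol

# Codes read off the example tree (left branch = 0, right branch = 1).
CODES = {
    "e": "1", " ": "0101", "l": "0110", "s": "0111",
    "y": "00000", "S": "00001", "m": "00010", "v": "00011",
    "k": "00101", "g": "00110", "n": "01000", "t": "01001",
    "r": "001000", ".": "001001", "a": "001110", EOF: "001111",
}

def encode(text):
    """Concatenate the code of every character, then append the EOF code."""
    return "".join(CODES[ch] for ch in text) + CODES[EOF]

bits = encode("Seems every eel eeks elegantly.")
padded = bits + "0" * (-len(bits) % 8)  # pad with zeros to a whole number of bytes
packed = int(padded, 2).to_bytes(len(padded) // 8, "big")

print(len(bits), len(padded), len(packed))  # 115 120 15
```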

Now that's usually not worth it for a small file like this, since you have to add the space taken up by the actual tree itself^(a), otherwise you cannot decode it at the other end. But certainly, for larger files where the distribution of characters isn't even, it can lead to impressive savings in space.

So, after all that, the Huffman tables in a JPEG are simply the tables that allow you to uncompress the stream into usable information.
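To make "uncompress the stream" concrete, here is a rough decoding sketch. It uses the text example's table rather than a real JPEG table, and the names DECODE, EOF and decode are illustrative only. Because no code is a prefix of another, you can read bits one at a time until they match an entry.

```python
EOF = "<EOF>"  # illustrative name for the end-of-file pseudo-symbol

# Same code table as the example tree, keyed by bit string for decoding.
DECODE = {
    "1": "e", "0101": " ", "0110": "l", "0111": "s",
    "00000": "y", "00001": "S", "00010": "m", "00011": "v",
    "00101": "k", "00110": "g", "01000": "n", "01001": "t",
    "001000": "r", "001001": ".", "001110": "a", "001111": EOF,
}

def decode(bits):
    """Read bits until a complete code is seen, emit its symbol, stop at EOF."""
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in DECODE:
            symbol = DECODE[current]
            if symbol == EOF:
                break  # anything after EOF is padding
            out.append(symbol)
            current = ""
    return "".join(out)

# "eel" followed by EOF, then four padding bits to reach 16 bits.
print(decode("1" "1" "0110" "001111" "0000"))  # eel
```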

The encoding process for JPEG consists of a few different steps (color conversion, chroma resolution reduction, block-based discrete cosine transforms, quantization, and so on) but the final step is a lossless Huffman encoding of the quantized coefficients in each block, which is what those tables are used to reverse when reading the image.

^(a) Probably the best case for minimal storage of this table would be something like:

Size of length section (8-bits) = 3 (longest bit length of 6 takes 3 bits)
Repeated for each byte:
    Actual length (3 bits, holding value between 1..6 inclusive)
    Encoding (n bits, where n is the actual length)
    Byte (8 bits)
End of table marker (3 bits) = 0 to distinguish from actual length above

For the text above, that would be:

00000011             8 bits

 n  bits   byte
--- ------ -----
001 1      'e'      12 bits
100 0101   <SPC>    15 bits
101 00001  'S'      16 bits
101 00010  'm'      16 bits
100 0111   's'      15 bits
101 00011  'v'      16 bits
110 001000 'r'      17 bits
101 00000  'y'      16 bits
101 00101  'k'      16 bits
100 0110   'l'      15 bits
101 00110  'g'      16 bits
110 001110 'a'      17 bits
101 01000  'n'      16 bits
101 01001  't'      16 bits
110 001001 '.'      17 bits
110 001111 <EOF>    17 bits

000                  3 bits
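The sizes in that listing can be totalled with a few lines. This is only a sketch of the hypothetical storage layout described in this footnote, not any real file format.

```python
# Codes from the example tree, in the order listed in the table.
codes = ["1", "0101", "00001", "00010", "0111", "00011", "001000", "00000",
         "00101", "0110", "00110", "001110", "01000", "01001", "001001", "001111"]

length_field = max(len(c) for c in codes).bit_length()   # longest code is 6 bits, needs 3
header = 8                                                # stores the size of the length field
entries = sum(length_field + len(c) + 8 for c in codes)   # length + code + byte per entry
terminator = length_field                                 # all-zero length marks the end

table_bits = header + entries + terminator
print(length_field, entries, table_bits)  # 3 253 264
```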

That makes the table 264 bits which totally wipes out the savings from compression. However, as stated, the impact of the table becomes far less as the input file becomes larger and there's a way to avoid the table altogether.

That way involves the use of another variant of Huffman, called Adaptive Huffman. This is where the table isn't actually stored in the compressed data.

Instead, during compression, the table starts with just EOF and a special bit sequence meant to introduce a new real byte into the table.

When introducing a new byte into the table, you would output the introducer bit sequence followed by the full eight bits of that byte.

Then, after each byte is output and the counts updated, the table/tree is rebalanced based on the new counts to be the most space-efficient. The rebalancing may be deferred to improve speed, as long as the same deferral happens during decompression. For example, you could rebalance every time you add a byte for the first 1K of input, then every 10K of input after that, assuming you've added new bytes since the last rebalance.

This means that the table itself can be built in exactly the same way at the other end (decompression), starting with the same minimal table with just the EOF and introducer sequence.

During decompression, when you see the introducer sequence, you can add the byte following it (the next eight bits) to the table with a count of zero, output the byte, then adjust the count and rebalance (or defer as previously mentioned).

That way, you do not have to have the table shipped with the compressed file. This, of course, costs a little more time during compression and decompression in that you're periodically rebalancing the table but, as with most things in life, it's a trade-off.
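A deliberately naive sketch of that idea follows. It rebuilds the whole code from the counts before every byte, which is simple but slow; real adaptive Huffman algorithms (FGK, Vitter) update the tree incrementally instead. The EOF and NEW pseudo-symbols and all function names are my own assumptions, and the output is a string of '0'/'1' characters rather than packed bytes.

```python
import heapq
from itertools import count

EOF, NEW = "EOF", "NEW"  # hypothetical pseudo-symbols: end of data, introducer

def build_codes(counts):
    """Build prefix codes from counts; deterministic for a given dict order."""
    order = count()  # unique tiebreak so heap entries never compare nodes
    heap = [(w, next(order), sym) for sym, w in counts.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(order), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a byte value or a pseudo-symbol
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

def compress(data):
    counts = {EOF: 0, NEW: 0}            # the minimal starting table
    out = []
    for byte in data:
        codes = build_codes(counts)
        if byte in counts:
            out.append(codes[byte])
        else:                            # introducer, then the raw eight bits
            out.append(codes[NEW] + format(byte, "08b"))
            counts[byte] = 0
        counts[byte] += 1
    out.append(build_codes(counts)[EOF])
    return "".join(out)

def decompress(bits):
    counts = {EOF: 0, NEW: 0}            # same starting table as the compressor
    out = bytearray()
    pos = 0
    while True:
        lookup = {code: sym for sym, code in build_codes(counts).items()}
        current = ""
        while current not in lookup:
            current += bits[pos]
            pos += 1
        sym = lookup[current]
        if sym == EOF:
            return bytes(out)
        if sym == NEW:                   # next eight bits are a new byte
            sym = int(bits[pos:pos + 8], 2)
            pos += 8
            counts[sym] = 0
        counts[sym] += 1
        out.append(sym)

data = b"Seems every eel eeks elegantly."
assert decompress(compress(data)) == data
```

Both sides insert new bytes into the table in the same order and build the code the same way, which is what keeps them in step without the table ever being transmitted.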

score 0 · Answer 2 · answered Feb 11 '11 at 08:45

The DHT marker doesn't specify directly which symbol is associated with a code. It contains a vector of 16 counts giving how many codes there are of each length from 1 to 16 bits. After that it contains a vector with the symbol values.

So when you want to decode, you have to generate the Huffman codes from the first vector and then associate each code, in order, with a symbol in the second vector.
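As a rough sketch of that generation step: codes are handed out in increasing numeric order within a length, and the running value is shifted left by one bit each time the length grows. The function name is hypothetical, and the counts and symbols below are example values modeled on a typical luminance DC table, so treat them as illustrative rather than as something read from a real file.

```python
def codes_from_dht(counts, symbols):
    """Map each bit string to its symbol, given the two DHT vectors.

    counts[i] is the number of codes of length i + 1 bits;
    symbols lists the symbol values in the order the codes are assigned.
    """
    table = {}
    code = 0
    index = 0
    for length, n in enumerate(counts, start=1):
        for _ in range(n):
            table[format(code, f"0{length}b")] = symbols[index]
            code += 1
            index += 1
        code <<= 1  # moving to the next length appends one bit
    return table

# Example values (illustrative): 16 counts, then 12 symbols.
counts = [0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
symbols = list(range(12))

table = codes_from_dht(counts, symbols)
for bits, symbol in table.items():
    print(f"{bits:<10} {symbol}")
```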
