I've been studying RFC 1951 (the DEFLATE compression algorithm) and found the interesting "learning" source file by Mark Adler with a cute name -- puff.c. And while brain parsing that file has been an illuminating experience with reasonable following of the RFC-1951 spec, the author does something which is not really clear to me -- he permutates the order by which he indexes into the length array.
The explanation is a bit vague in the comments of the code:
The code length code lengths are received in a permuted order (see the order[] array below) to make a short code length code length list more likely. As it turns out, very short and very long codes are less likely to be seen in a dynamic code description, hence what may appear initially to be a peculiar ordering.
And here's the permutation array described upstairs:
static const short order[19] = /* permutation of code length codes */
{16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15};
So, my questions are:
- Why is there no mention of this in RFC-1951?
- How did they decide upon the exact permutation laid out in
order
? - How does it work? Why doesn't the decoding process fail since the order has changed?
Much obliged. This isn't homework nor am I trying to outmatch the original algorithm, just for the learning experience. I searched this website and Google was my friend, but to no avail.