When decoding entropy-encoded DC values in JPEG (or the entropy-encoded prediction differences in lossless JPEG), how do I distinguish between 1-bits that have been stuffed to pad a byte before a marker and a Huffman-coded value?
For example, if I see:
0xAF 0xFF 0xD9
and I have already consumed the bits in 0xA, how can I tell whether the next 0xF is padding or should be decoded?
This is from the JPEG Spec:
F.1.2.3 Byte stuffing
In order to provide code space for marker codes which can be located in the compressed image data without decoding, byte stuffing is used.
Whenever, in the course of normal encoding, the byte value X’FF’ is created in the code string, a X’00’ byte is stuffed into the code string. If a X’00’ byte is detected after a X’FF’ byte, the decoder must discard it. If the byte is not zero, a marker has been detected, and shall be interpreted to the extent needed to complete the decoding of the scan.
Byte alignment of markers is achieved by padding incomplete bytes with 1-bits. If padding with 1-bits creates a X’FF’ value, a zero byte is stuffed before adding the marker.
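To make the quoted rules concrete, here is a minimal Python sketch of the byte-level side of F.1.2.3 as I read it: drop a 0x00 that follows 0xFF, and stop when 0xFF is followed by a non-zero byte. The function name `unstuff_scan` and its return shape are my own, not from the spec or any library. It also skips repeated 0xFF fill bytes before a marker, which the spec allows elsewhere, and it stops at the first marker of any kind, including restart markers. It does not settle which of the trailing bits are padding; it only isolates the bytes that hold them.

```python
def unstuff_scan(data: bytes, start: int = 0):
    """Return (payload, marker) for entropy-coded data beginning at `start`.

    payload: the scan bytes with stuffed 0x00 bytes removed.
    marker:  the byte following 0xFF that ended the scan, or None if the
             input ran out before any marker was found.
    Hypothetical helper for illustration only.
    """
    out = bytearray()
    i = start
    while i < len(data):
        b = data[i]
        if b != 0xFF:
            out.append(b)          # ordinary data byte
            i += 1
            continue
        if i + 1 >= len(data):
            break                  # lone trailing 0xFF: truncated input
        nxt = data[i + 1]
        if nxt == 0x00:
            out.append(0xFF)       # stuffed zero: keep the 0xFF, drop the 0x00
            i += 2
        elif nxt == 0xFF:
            i += 1                 # fill byte before a marker: skip it
        else:
            return bytes(out), nxt # 0xFF + non-zero byte: a marker
    return bytes(out), None


payload, marker = unstuff_scan(bytes([0xAF, 0xFF, 0xD9]))
print(payload.hex(), hex(marker))
```

For the bytes in the question, this yields a single payload byte 0xAF and the marker 0xD9 (EOI), so the low nibble 0xF is the last four bits available before the marker.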