I am currently researching strategies for implementing an efficient BitStream in pure C. I need this to implement various bit-based compression algorithms. However, I cannot find much literature on the topic, and there don't seem to be many good examples available.
Here is what I am looking for:
- Mostly macro-based implementation to avoid function calls
- Functions to read/write an arbitrary number of bits 'n' to/from the BitStream.
- Functions to read/write a fixed number of bits (5 bits, for example), optimized relative to the generic version.
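To make these requirements concrete, here is a rough sketch of the kind of read interface I have in mind. It is only an illustration, not a tuned implementation: the names `BitReader`, `br_init`, `br_get`, and `BR_GET5` are made up, the bit order is assumed to be MSB-first, and reads past the end of the buffer are assumed to return zero bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state (names are illustrative only). */
typedef struct {
    const uint8_t *p;    /* next unread byte */
    const uint8_t *end;  /* one past the last byte of the buffer */
    uint64_t acc;        /* unread bits, right-aligned; higher bits are stale */
    unsigned count;      /* number of valid unread bits in acc */
} BitReader;

static inline void br_init(BitReader *br, const uint8_t *buf, size_t len)
{
    br->p = buf;
    br->end = buf + len;
    br->acc = 0;
    br->count = 0;
}

/* Generic read of n bits (1..32), MSB-first.
 * Assumption: reading past the end of the buffer yields zero bits. */
static inline uint32_t br_get(BitReader *br, unsigned n)
{
    assert(n >= 1 && n <= 32);
    /* Refill one byte at a time until at least n bits are available. */
    while (br->count < n) {
        uint8_t byte = (br->p < br->end) ? *br->p++ : 0;
        br->acc = (br->acc << 8) | byte;
        br->count += 8;
    }
    br->count -= n;
    /* Extract the n oldest unread bits and mask off anything above them. */
    return (uint32_t)((br->acc >> br->count) & (((uint64_t)1 << n) - 1));
}

/* Fixed-width variant: with a constant n the compiler can usually fold
 * the mask and simplify the refill test once br_get is inlined. */
#define BR_GET5(br) br_get((br), 5)
```

With the two input bytes `0xA5 0xC3`, calling `BR_GET5` twice and then `br_get(&br, 6)` should produce 20, 23, and 3. Whether an inline function wrapped in a macro like this is actually as fast as a hand-written macro is part of what I am asking.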
I am wondering about the following:
- Which variables should be maintained in the BitStream. Candidates include a BYTE pointer, a byte position, a bit index in the current byte, the number of bits left in the current byte, etc.
- How to reduce the number of variables to maintain. The more variables we have, the more variables we need to update.
- How to use as few intermediate/temporary variables as possible within a single read/write operation.
- Whether operations should be done at the BYTE level, the UINT16 level, or the UINT32 level. Maybe accumulating bits into a UINT32 and writing the bytes out when it is full (or when writing is done, with a flush operation) would be a lot faster than doing everything per byte.
- How to avoid looping as much as possible. Ideally, we should never loop over the individual bits being written into the BitStream.
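To illustrate the accumulator idea from the list above, here is a minimal sketch of a writer. It is written under my own assumptions rather than taken from any existing library: the names `BitWriter`, `bw_init`, `bw_put`, and `bw_flush` are hypothetical, bits are emitted MSB-first, a 64-bit accumulator is used so that up to 32 bits can be appended in one step, and the caller is assumed to provide a large enough output buffer (there is no bounds checking).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical writer state: an output cursor, an accumulator, a bit count. */
typedef struct {
    uint8_t *out;    /* next byte to write (buffer size is the caller's job) */
    uint64_t acc;    /* pending bits, right-aligned; higher bits are stale */
    unsigned count;  /* number of pending bits in acc, < 8 between calls */
} BitWriter;

static inline void bw_init(BitWriter *bw, uint8_t *buf)
{
    bw->out = buf;
    bw->acc = 0;
    bw->count = 0;
}

/* Append the low n bits of value (n in 1..32), MSB-first.
 * The loop runs once per completed byte, never once per bit. */
static inline void bw_put(BitWriter *bw, uint32_t value, unsigned n)
{
    assert(n >= 1 && n <= 32);
    bw->acc = (bw->acc << n) | (value & (((uint64_t)1 << n) - 1));
    bw->count += n;
    while (bw->count >= 8) {
        bw->count -= 8;
        *bw->out++ = (uint8_t)(bw->acc >> bw->count);
    }
}

/* Write out any remaining bits, zero-padded on the right, and return
 * the total number of bytes produced since start. */
static inline size_t bw_flush(BitWriter *bw, const uint8_t *start)
{
    if (bw->count > 0) {
        *bw->out++ = (uint8_t)(bw->acc << (8 - bw->count));
        bw->count = 0;
    }
    return (size_t)(bw->out - start);
}
```

Writing the values 20 (5 bits), 23 (5 bits), and 3 (6 bits) and then flushing should give the two bytes `0xA5 0xC3`. This still loops per output byte; storing whole 32-bit words at once would remove that loop, but it brings in byte-order and alignment concerns that I am unsure how best to handle.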
This may look like overkill, but when the rest of the compression code has been heavily optimized, the BitStream part ends up spoiling the whole thing. For instance, it is not rare to see assembly routines using SIMD CPU instructions in image compression code to optimize part of the encoding process, yet the last step is still to write to a BitStream.
Ideas, references, anyone? Thank you!