
My algorithm produces a stream of 9-bit and 17-bit values. I need to find a way to store this data in a file, but I can't just store each 9-bit value as an int and each 17-bit value as a 32-bit integer.

For example, if my algorithm produces 10 x 9-bit and 5 x 17-bit values, the output file size needs to be 22 bytes.

Another big problem to solve is that the output file can be very large and its size is not known in advance.

The only idea I have right now is to use bool *vector;

Dymsza
  • If it is always 17 bits followed by 9 bits then just write the 26 bits as an int_32. Extract with bit manipulation. – Duck Nov 08 '12 at 04:49
  • It is random; in fact, there may be situations where there are only 9-bit values. – Dymsza Nov 08 '12 at 04:56
  • Is the sequence of M*9bit-vals, and N*17bit-vals defined? I.e. can we assume we're writing all 9bit-vals, then all 17-bitvals, or will they intersperse and that pattern must be preserved? This is *very* important for how you expect to store and read these, is the only reason I ask. – WhozCraig Nov 08 '12 at 07:03

2 Answers


If you have to save a variable number of bits per value, then you should probably save two things for each value: first, either the number of valid bits (if the bits are consecutive from 0 to x) or a bitmask saying which bits are valid; second, the 32-bit integer holding your bits.
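
A minimal sketch of that idea, assuming each value is written as a fixed 5-byte record: one byte for the number of valid bits, followed by the 32-bit integer. The names `writeRecord`, `readRecord`, and `demoRecords`, and the little-endian byte order, are assumptions made for illustration and are not taken from the answer.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record layout: 1 byte bit count + 4 bytes value (little-endian).
void writeRecord(std::ostream& out, std::uint32_t value, std::uint8_t nbits) {
    assert(nbits >= 1 && nbits <= 32);
    unsigned char buf[5];
    buf[0] = nbits;
    for (int i = 0; i < 4; ++i)
        buf[1 + i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
    out.write(reinterpret_cast<const char*>(buf), sizeof buf);
}

// Reads one record; returns false when fewer than 5 bytes remain.
bool readRecord(std::istream& in, std::uint32_t& value, std::uint8_t& nbits) {
    unsigned char buf[5];
    if (!in.read(reinterpret_cast<char*>(buf), sizeof buf)) return false;
    nbits = buf[0];
    value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(buf[1 + i]) << (8 * i);
    return true;
}

// Demo helper: encode one 9-bit and one 17-bit value in memory.
std::string demoRecords() {
    std::ostringstream out;
    writeRecord(out, 0x1ABu, 9);
    writeRecord(out, 0x1FFFFu, 17);
    return out.str();
}
```

This layout costs 5 bytes per value, so the 15 values from the question would take 75 bytes rather than 22. It trades space for simplicity.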

Some programmer dude

Taking your example literally: if you want to store 175 bits made up of an unknown number of entities of two different lengths, then the file absolutely cannot be only 22 bytes. The reader needs to know what comes next in the file, so you need to store the lengths. Since you have only two possible sizes, a single bit per entity is enough: 0 means 9 bits, 1 means 17 bits.

|0|9bit|0|9bit|1|17bit|0|9bit|1|17bit|1|17bit|...

So for your example, you would need 10*(1+9) + 5*(1+17) = 190 bits, which rounds up to 24 bytes. The remaining 2 bits must be padded with 0s so that the file ends on a byte boundary. The fact that you will keep reading the file as if another entity followed (because you said you don't know how long the file is) shouldn't be a problem, because the final padding is at most 7 bits, which is always shorter than the smallest entity. Upon reaching end of file, you can throw away the last incomplete read.
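
The reading side of this layout could be sketched roughly as follows. This is only an illustration under assumptions: bits are packed most significant bit first, and the names `TaggedBitReader` and `decodeAll` are hypothetical. The reader pulls bytes from a stream one at a time, so it never needs the file size, and it reports failure when the stream ends in the middle of an entity, which is how the trailing zero padding gets discarded.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader for the |tag|payload| layout (tag 0 = 9 bits, 1 = 17 bits).
class TaggedBitReader {
public:
    explicit TaggedBitReader(std::istream& in) : in_(in) {}

    // Reads one entity; returns false at end of data or on an incomplete entity.
    bool next(std::uint32_t& value, bool& wide) {
        std::uint32_t tag = 0;
        if (!take(1, tag)) return false;
        wide = (tag == 1);
        return take(wide ? 17 : 9, value);
    }

private:
    // Extracts the next nbits bits, refilling the accumulator byte by byte.
    bool take(int nbits, std::uint32_t& out) {
        assert(nbits >= 1 && nbits <= 32);
        while (count_ < nbits) {
            int c = in_.get();
            if (c == std::istream::traits_type::eof()) return false;
            acc_ = (acc_ << 8) | static_cast<std::uint64_t>(c & 0xFF);
            count_ += 8;
        }
        count_ -= nbits;
        std::uint64_t mask = (std::uint64_t{1} << nbits) - 1;
        out = static_cast<std::uint32_t>((acc_ >> count_) & mask);
        return true;
    }

    std::istream& in_;
    std::uint64_t acc_ = 0;  // unread bits sit in the low count_ bits
    int count_ = 0;
};

// Demo helper: decode every complete value from a packed byte string.
std::vector<std::uint32_t> decodeAll(const std::string& packed) {
    std::istringstream in(packed);
    TaggedBitReader reader(in);
    std::vector<std::uint32_t> values;
    std::uint32_t v = 0;
    bool wide = false;
    while (reader.next(v, wide)) values.push_back(v);
    return values;
}
```

For a real file, an `std::ifstream` opened in binary mode could be passed in place of the `std::istringstream`.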

This approach does require implementing bit-level manipulation on top of a byte-level stream, which means careful masking and logical operations. Base64 does exactly that, only it is simpler than your case: it consists of fixed 6-bit entities stored in a text file.
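
To give a feel for the masking and shifting involved, here is a rough sketch of the writing side. It assumes most-significant-bit-first packing and a 1-bit tag (0 = 9 bits, 1 = 17 bits) in front of every value. `StreamBitWriter`, `putTagged`, and `packExample` are hypothetical names. Only a few pending bits are held in memory and every complete byte goes straight to the output stream, so the total size does not need to be known in advance.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical streaming bit writer: emits each byte as soon as it is complete.
class StreamBitWriter {
public:
    explicit StreamBitWriter(std::ostream& out) : out_(out) {}

    // Append the low nbits bits of value (1..32), most significant bit first.
    void put(std::uint32_t value, int nbits) {
        assert(nbits >= 1 && nbits <= 32);
        std::uint64_t mask = (std::uint64_t{1} << nbits) - 1;  // keep only nbits
        acc_ = (acc_ << nbits) | (value & mask);
        count_ += nbits;
        while (count_ >= 8) {  // write every complete byte
            count_ -= 8;
            out_.put(static_cast<char>((acc_ >> count_) & 0xFFu));
        }
    }

    // Write the tag bit, then the 9-bit or 17-bit payload.
    void putTagged(std::uint32_t value, bool wide) {
        put(wide ? 1u : 0u, 1);
        put(value, wide ? 17 : 9);
    }

    // Pad the last partial byte with zero bits and write it.
    void finish() {
        if (count_ > 0) {
            out_.put(static_cast<char>((acc_ << (8 - count_)) & 0xFFu));
            count_ = 0;
        }
    }

private:
    std::ostream& out_;
    std::uint64_t acc_ = 0;  // pending bits sit in the low count_ bits
    int count_ = 0;
};

// Demo helper: pack 10 nine-bit and 5 seventeen-bit values in memory.
std::string packExample() {
    std::ostringstream out;
    StreamBitWriter w(out);
    for (std::uint32_t i = 0; i < 10; ++i) w.putTagged(i, false);
    for (std::uint32_t i = 0; i < 5; ++i) w.putTagged(i, true);
    w.finish();
    return out.str();
}
```

With a real file, an `std::ofstream` opened in binary mode would take the place of the `std::ostringstream`, and `finish()` has to be called once after the last value.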

Pavel Zdenek