I was wondering how I can save space when writing a bitset to a file (probably using iostream) in C++. Will breaking the bitset up into bitsets of size 8 and then writing each of those to the file save space? What are your thoughts on this? The goal is data compression.
2 Answers
If you normally write one byte per bit in the bitset, then yes, storing eight elements to a byte will save you 7/8 of the space in the limit (you will have to store the size of the bitset somewhere, of course).
For example, this writes a bitset using one character per bit (7/8 overhead):
for (size_t i=0, n=bs.size(); i<n; ++i)
stream << bs[i];
while this stores it optimally compact (if we disregard padding at the end):
for (size_t i=0, n=(bs.size() + 7) / 8; i<n; ++i) {
    uint8_t byte=0;
    for (size_t j=0; j<8; ++j) {
        size_t k = i*8 + j;
        // bits past the end of the bitset become zero padding
        byte = (byte << 1) | (k < bs.size() && bs[k]);
    }
    stream << byte;
}
Note that uint8_t is not standard C++03. It resides in C99's <stdint.h> or C++0x's <cstdint>. You can also use an std::bitset<8> if you want.

- I'm just not sure of the behaviour of the writing if I write a bitset of, let's say, size 600. – DogDog Mar 02 '11 at 20:51
- @Apoc: I don't understand what you're afraid of. Could you post some code? – Fred Foo Mar 02 '11 at 20:52
- @Apoc: You might want to use a `boost::dynamic_bitset` (http://www.boost.org/doc/libs/release/libs/dynamic_bitset/dynamic_bitset.html) instead if your bitsets are very large and can have variable size. – Emile Cormier Mar 02 '11 at 20:57
- Do you know if using an `ostream_iterator` will also use a byte per bit? What about using an `ostream_iterator` with `vector`? – user470379 Mar 02 '11 at 22:21
- @user470379: neither will write single bits, since an `ostream` simply won't allow that. C++ streams are fundamentally byte-oriented, as are modern operating systems, file systems, network devices, etc. – Fred Foo Mar 02 '11 at 22:27
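To illustrate the last point, here is a small sketch, assuming the container in question is a std::vector<bool>; dump_bits is a hypothetical helper name. Each element goes through the stream's formatted output, so every bit becomes one full character.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: send every element of bits through an
// ostream_iterator and return what ended up in the stream.
std::string dump_bits(const std::vector<bool>& bits) {
    std::ostringstream out;
    // Each bool is formatted as the character '0' or '1'.
    std::copy(bits.begin(), bits.end(), std::ostream_iterator<bool>(out));
    return out.str();
}
```

The returned string has exactly as many characters as the vector has elements, so a 600-bit vector still costs 600 bytes when written this way.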
If you use boost::dynamic_bitset instead, you can specify the type of the underlying blocks, extract them with the to_block_range function, and load them back with from_block_range.
http://www.boost.org/doc/libs/1_46_0/libs/dynamic_bitset/dynamic_bitset.html#to_block_range
(for example, use unsigned char as the block type and store the blocks in a stream opened in binary mode)
