
I have an algorithm where I currently use two unsigned integers as bitmaps to store information about the input; this limits the maximum input size to 64, so I'd like to create a version where the integers are replaced by a bitset or simple big integer. I started writing something using vector<bool>, but looking around on SO, I'm seeing a lot of answers telling me to avoid vector<bool>.

The operations I need:

  • Initialize to all-zeros.
  • Shift left (multiply by two) and set new lsb.
  • Add and set msb.
  • Compare two sets to find smallest/lexicographically first.

When they are created, I know the maximum number of bits, but at first I'll need only 1 bit; then, at every step, one set is shifted left while the other will have a new highest bit added:

{
    a <<= 1;
    a[0] = x;
    b[++msb] = y;
    if (a < b) b = a;
} 

If I create the bitsets with size 1, and then gradually expand them, maybe the comparisons will be quicker than if I immediately set the length to the maximum and have potentially thousands of leading zeros?

So should I continue using vector<bool> or use std::bitset (which unfortunately is fixed-size) or write a simple biginteger implementation capable of just the operations mentioned above using a vector of unsigned ints?


Using vector<bool>, you can initialize the vectors with zero length:

std::vector<bool> a(0), b(0);

and then perform the operations mentioned above like this:

{
    a.push_back(x);
    b.insert(b.begin(), y);
    if (a < b) b = a;
}
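A minimal sketch of that loop body as a function, assuming the bits are stored most significant first (index 0 is the msb), which is what makes push_back act as "shift left and set lsb". The names step and to_value are hypothetical helpers added for illustration, not part of any library:

```cpp
#include <cassert>
#include <vector>

// One step of the loop. Bits are stored msb-first, so index 0 is the msb.
void step(std::vector<bool>& a, std::vector<bool>& b, bool x, bool y) {
    a.push_back(x);           // shift left and set the new lsb
    b.insert(b.begin(), y);   // add a new msb (linear time: every bit moves)
    assert(a.size() == b.size());
    if (a < b) b = a;         // operator< is lexicographic, which matches
                              // numeric order when both lengths are equal
}

// Hypothetical helper for inspecting small values (up to 64 bits).
unsigned long long to_value(const std::vector<bool>& bits) {
    unsigned long long v = 0;
    for (bool bit : bits) v = (v << 1) | (bit ? 1u : 0u);
    return v;
}
```

Note that the insert at the front is linear in the number of bits, so this version trades speed for simplicity.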
  • I'm not sure any of the containers you mention support the full set of operations you want, but so what? You can implement your own class using one of the standard containers (like others have suggested, I would avoid `vector<bool>`) and then change the implementation as circumstances dictate. Isn't this one of the things OO programming is supposed to be about? –  Sep 20 '17 at 21:30
  • Maybe you should take a look at [GMP](https://gmplib.org). – Jesper Juhl Sep 20 '17 at 21:40
  • @JesperJuhl or http://www.boost.org/doc/libs/1_65_1/libs/multiprecision/doc/html/boost_multiprecision/tut/ints/cpp_int.html (which also wraps other libs like GMP http://www.boost.org/doc/libs/1_65_1/libs/multiprecision/doc/html/boost_multiprecision/tut/ints/gmp_int.html) – sehe Sep 20 '17 at 22:04
  • The critique of `vector<bool>` is generally that it doesn't store any `bool`s. That makes it impossible to fulfill all the requirements of a container, particularly returning references as `bool&`. If this is not a problem for you, there is nothing else particularly wrong with the type. – Bo Persson Sep 21 '17 at 00:53

2 Answers


I think boost::dynamic_bitset is what you're after.

Here is an example covering your requirements:

#include <iostream>
#include <boost/dynamic_bitset.hpp>

int main() {
    boost::dynamic_bitset<> a(3, 2); // 3 bits with value 2: a = 010
    a[0] = true;                     // set the lsb: a = 011
    a.push_back(true);               // add a new msb: a = 1011
    boost::dynamic_bitset<> b = a;   // b = 1011
    a <<= 1;                         // size stays 4, top bit is dropped: a = 0110
    bool aless = a < b;              // true (6 < 11)
    unsigned long al = a.to_ulong(); // al = 6
    std::cout << "a=" << a << ", al=" << al << "\n"
              << "b=" << b << ", bl=" << b.to_ulong() << "\n"
              << "a<b=" << aless << "\n";
}

A few notes:

  • The object is totally dynamic, with no opportunity to take advantage of your knowledge about a maximum size. I believe it doesn't even use the small object optimisation, so it will always allocate some dynamic memory.
  • The constructor is a bit peculiar. The first parameter is the number of bits, and the second is their value as an integer. That means to initialise to a single true bit, as you requested, you would use dynamic_bitset<>(1, 1). Sadly there is no initializer_list constructor so you can't just do a = {true}. Perhaps the clearest thing would be to default construct the object and push_back(true) on a separate line.
  • push_back affects the most significant bit i.e. the value on the left. That's because "front" means element 0, which is the least significant bit.
  • The shift left operator does not grow the object, so to append an item to the front you need to:
    1. a.push_back(false) (the value you push doesn't matter because it will get thrown away in a moment).
    2. a <<= 1
    3. a[0] = x if you want to set the new value.
  • to_ulong() will only work if the object has few enough elements that it fits in an unsigned long on your platform. Note that it is not an unsigned long long, so on some 64-bit platforms (notably Windows) it is only 32 bits.
  • There are some other interesting methods worth taking a look at e.g. any(), all() and count().
Arthur Tacca
  • Can you elaborate on that and provide objective arguments, or is this just an opinion? – Christophe Sep 20 '17 at 21:54
  • I'm afraid this is just an opinion. My experience with dynamic bitset is that the interface is rather bitset oriented. You know. As in "set of bits" (not: a multi-limb integral type) /cc @Christophe – sehe Sep 20 '17 at 22:04
  • A guess is *not* an answer. A comment at best. – Jesper Juhl Sep 20 '17 at 22:13
  • @JesperJuhl I'm sorry my language was too equivocal for you. I did not guess. I knew that `dynamic_bitset` satisfied everything the question had asked for, but didn't presume I had perfectly understood the questioner. – Arthur Tacca Sep 21 '17 at 06:36
  • @Christophe Done – Arthur Tacca Sep 21 '17 at 06:36
  • @sehe That's true, but it seems to be what the questioner is after. They seem to ask for a lot of bit-oriented operations, and seem to mention integers only for the performance benefits of bit-packing, which dynamic bitset does provide (albeit with a dynamic memory allocation). – Arthur Tacca Sep 21 '17 at 06:38
  • Thanks for expanding the answer. – m69's been on strike for years Sep 21 '17 at 06:46
  • @ArthurTacca agree. Somehow I picked on the "BigInteger" mention in the question a bit too much. – sehe Sep 21 '17 at 07:28
  • @sehe I didn't quite know how best to refer to it, because I'm doing integer stuff like less-than, but also adding a new msb, which feels more like bitset stuff. The dynamic_bitset indeed offers both. The only slight drawback is the two-step left shift. – m69's been on strike for years Sep 21 '17 at 12:48

The operations you describe (leaving out the implicit interpretation as an integer) are actually those provided efficiently by a deque. If you can tolerate the memory overhead, you could use std::deque<bool> (std::list<bool> would also work, but would have even higher overhead).

If the overhead is too much, you could start with

#include <deque>

struct Bits {
  std::deque<unsigned> deq;
  int ms_free,ls_free;   // unused bits in the two end words
};

and write methods to push bits on either end (for the right end, you would deq.push_back() if ls_free==0 and store into deq.back() otherwise). Comparison would use deq.size() and ms_free+ls_free to know how to align the two sequences.
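
One way those methods might look is sketched below. The member functions push_lsb, push_msb, size and get are hypothetical names, and the layout (front word holds the most significant bits) is an assumption. For clarity the comparison walks the bits one at a time rather than aligning whole words, so it shows the bookkeeping but not the fast path:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <limits>

struct Bits {
    static constexpr int W = std::numeric_limits<unsigned>::digits; // bits per word
    std::deque<unsigned> deq;      // front word holds the most significant bits
    int ms_free = 0, ls_free = 0;  // unused bits in the two end words

    std::size_t size() const { return deq.size() * W - ms_free - ls_free; }

    // Shift left by one and set the new lsb.
    void push_lsb(bool x) {
        if (ls_free == 0) { deq.push_back(0); ls_free = W; }
        --ls_free;
        if (x) deq.back() |= 1u << ls_free;
    }

    // Add a new msb above the current one.
    void push_msb(bool y) {
        if (ms_free == 0) { deq.push_front(0); ms_free = W; }
        if (y) deq.front() |= 1u << (W - ms_free);
        --ms_free;
    }

    // Bit i, counting from the msb (i == 0 is the msb).
    bool get(std::size_t i) const {
        assert(i < size());
        std::size_t p = ms_free + i;  // offset from the top of the front word
        return (deq[p / W] >> (W - 1 - p % W)) & 1u;
    }
};

// Lexicographic comparison from the msb; a proper prefix compares as smaller.
bool operator<(const Bits& a, const Bits& b) {
    std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i)
        if (a.get(i) != b.get(i)) return b.get(i);
    return a.size() < b.size();
}
```

When both sequences have the same length, as in the question's loop, this lexicographic order coincides with numeric order. A word-at-a-time comparison would need to shift one operand by the difference in alignment, which is where ms_free and ls_free come in.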

Davis Herring