
I am writing a C++ program and I need to estimate the memory occupied by a boost::dynamic_bitset<>. The sizeof() operator always returns 32, but I doubt that reflects the real usage, especially when the number of bits exceeds 1 million.

What is the better way to estimate this?

Thanks in advance.

Giridhar
  • Read about vector's sizeof: https://stackoverflow.com/questions/34034849/what-is-the-size-of-sizeofvector-c. It's the same issue. `sizeof` is the "size of an interface", not how much memory the structure uses, so it doesn't matter how many elements are in the structure. – fas Jan 12 '21 at 05:28

2 Answers


You can expect the actual memory usage to be (in bytes):

  • sizeof the object itself (which you've said is 32 bytes in your case) plus
  • the number of bits you constructed the dynamic_bitset with, divided by 8. The dynamic memory allocation library will probably round this up somewhat, but at worst probably to the next power of two in bytes

If you push_back or append to the dynamic_bitset to increase its length gradually, it'll probably double the memory usage periodically, same way e.g. std::vector and std::unordered_map do, but I haven't checked the code or docs for that. That's generally considered the sane compromise between excessive copying and being wasteful of memory. If you want to check, look at the source code.
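One way to check the growth behaviour without reading the source is to count how often the capacity changes while appending. The sketch below does this for std::vector, which the paragraph above uses as the comparison; the function name `count_capacity_changes` is made up for illustration, and the growth factor it observes is implementation-defined. It does not confirm anything about dynamic_bitset itself, but the same loop could be tried there, since dynamic_bitset also provides push_back and capacity().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how many times a std::vector<int> changes
// its capacity while n elements are appended one at a time.
inline std::size_t count_capacity_changes(std::size_t n) {
    std::vector<int> v;
    std::size_t changes = 0;
    std::size_t last = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(0);
        if (v.capacity() != last) {
            // A reallocation happened: remember the new capacity.
            ++changes;
            last = v.capacity();
        }
    }
    return changes;
}
```

If the container grows geometrically, the count for a million appends stays in the tens rather than the thousands.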

At runtime, you can get a more accurate idea of the bytes currently allocated by calling .capacity() and dividing by 8. There can still be a little overhead from the allocation library, and the fixed sizeof bytes for the object itself (32 in your case) come on top:

size_type capacity() const;

Returns: The total number of elements that *this can hold without requiring reallocation. Throws: nothing.

See: https://www.boost.org/doc/libs/1_75_0/libs/dynamic_bitset/dynamic_bitset.html

You can divide by 8 because you can fit 8 bits in a single byte. (Strictly speaking it's better to use CHAR_BIT instead of hard-coding 8).
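The estimate described above can be written as a small helper. This is a minimal sketch, not part of Boost: the name `estimated_bytes` is hypothetical, and it takes the two numbers mentioned above (the sizeof of the object and the value returned by capacity()) as plain arguments. It ignores allocator overhead, so treat the result as a lower bound.

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t

// Hypothetical helper (not a Boost API): fixed object size plus the
// capacity in bits converted to bytes, rounded up to a whole byte.
// Allocator bookkeeping overhead is not included.
constexpr std::size_t estimated_bytes(std::size_t object_size,
                                      std::size_t capacity_bits) {
    return object_size + (capacity_bits + CHAR_BIT - 1) / CHAR_BIT;
}
```

For a bitset bs the call would be estimated_bytes(sizeof(bs), bs.capacity()). With the 32-byte object from the question and a capacity of 1,000,000 bits, on a platform with 8-bit bytes, this gives 125,032 bytes.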

Tony Delroy

The general answer, here, IMO is to measure.

You do this using a memory profiler. Example: Valgrind Massif

Just the other day I did this for a comment on a closed question:

@RetiredNinja just ran with 20gibibit, no problem (except it took 5 minutes) and peak allocation of 20.95GiB (18.63GiB in a std::string) – sehe 22 hours ago

The code was:

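#include <boost/dynamic_bitset.hpp>
#include <boost/lexical_cast.hpp>
#include <cstddef>   // std::size_t
#include <cstdlib>   // std::rand
#include <iostream>
#include <string>

int main() {
    // 20 billion bits: roughly 2.5 GB of block storage
    boost::dynamic_bitset<> dbs(20'000'000'000);
    for (std::size_t i = 0; i < dbs.size(); ++i) {
        dbs.set(i, std::rand() % 2);
    }
    // to_string writes one char per bit, which dominates peak memory
    std::string s;
    to_string(dbs, s);
    std::cout << "Size: " << (dbs.size() >> 30) << "gibibit\n";
    std::cout << s.substr(0, 10) << " ... " << s.substr(s.size() - 10, 10) << "\n";
}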

The command I used:

time valgrind --tool=massif ./sotest 

And the result is a file massif.out.10060 that you can analyze using ms_print:

--------------------------------------------------------------------------------
Command:            ./sotest
Massif arguments:   (none)
ms_print arguments: massif.out.10060
--------------------------------------------------------------------------------


    GB
20.95^                                                                       #
     |                                                            :::::::::::#
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |                                                            :          #
     |:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::          #
     |:                                                           :          #
   0 +----------------------------------------------------------------------->Ti
     0                                                                   1.629

Number of snapshots: 10
 Detailed snapshots: [5 (peak)]

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0              0                0                0             0            0
  1      2,280,103           72,712           72,704             8            0
  2      2,390,775    2,500,074,448    2,500,072,704         1,744            0
  3 1,508,985,532,574   22,500,076,440   22,500,072,705         3,735            0
  4 1,791,172,952,129   22,500,077,472   22,500,073,729         3,743            0
  5 1,791,172,960,303   22,500,077,472   22,500,073,729         3,743            0
100.00% (22,500,073,729B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->88.89% (20,000,000,001B) 0x4F6E408: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
| ->88.89% (20,000,000,001B) 0x4F6EE3E: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace_aux(unsigned long, unsigned long, unsigned long, char) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
|   ->88.89% (20,000,000,001B) 0x1094AB: void boost::to_string_helper<unsigned long, std::allocator<unsigned long>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(boost::dynamic_bitset<unsigned long, std::allocator<unsigned long> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, bool) (basic_string.h:1453)
|     ->88.89% (20,000,000,001B) 0x109067: main (dynamic_bitset.hpp:1273)
|       
->11.11% (2,500,000,000B) 0x108F83: main (new_allocator.h:115)
| 
->00.00% (73,728B) in 1+ places, all below ms_print's threshold (01.00%)

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  6 1,791,172,960,303    2,500,075,480    2,500,073,728         1,752            0
  7 1,791,172,960,356           73,744           73,728            16            0
  8 1,791,172,971,034            1,032            1,024             8            0
  9 1,791,172,972,716                0                0             0            0

Or you can use a GUI tool like massif-visualizer.

For more examples see my other answers using it on this site: https://stackoverflow.com/search?tab=votes&q=user%3a85371%20massif

sehe