
I have a dataset which I'd like to consume with tbb::parallel_for in intervals of size interval_size. Each interval that my functor consumes should contain exactly interval_size elements, except for the last partial interval, which may be smaller when interval_size does not evenly divide my dataset.

Is there a way to use TBB to statically partition in this manner? This test produces several intervals smaller than interval_size on my system:

#include <tbb/parallel_for.h>
#include <iostream>

struct body
{
  void operator()(const tbb::blocked_range<size_t> &r) const
  {
    std::cout << "range size: " << r.end() - r.begin() << std::endl;
  }
};

int main()
{
  size_t num_intervals = 4;
  size_t interval_size = 3;

  // consume num_intervals plus a partial interval in total
  size_t n = num_intervals * interval_size + (interval_size - 1);
  tbb::parallel_for(tbb::blocked_range<size_t>(0, n, interval_size),
                    body(),
                    tbb::simple_partitioner());

  return 0;
}

The output:

$ g++ test_parallel_for.cpp -ltbb
$ ./a.out 
range size: 3
range size: 2
range size: 2
range size: 3
range size: 2
range size: 2
Jared Hoberock

1 Answer


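This happens because simple_partitioner keeps splitting your range in half until each piece is no larger than the grain size. When used with tbb::blocked_range<size_t>(i, j, grainsize), the resulting chunks satisfy:

ceil(grainsize/2) <= chunksize <= grainsize

where chunksize is the size of the sub-range passed to your body. With a grainsize of 3, chunks of size 2 and 3 are both allowed, which is what your output shows.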

For more information, see section 3.2.5, Partitioner Summary, of the TBB Tutorial.
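To see where the sizes 3, 2, 2, 3, 2, 2 come from, here is a minimal sketch that mimics the splitting without using TBB at all. The helper names (split_sizes, simulate_simple_partitioner) are hypothetical and not part of the library. It assumes that a range is divisible while its size exceeds the grain size and that each split happens at the midpoint, which is how blocked_range behaves as far as I know.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not a TBB function: recursively halve [begin, end)
// until a piece is no larger than grainsize, recording each leaf's size.
// Assumes grainsize >= 1.
void split_sizes(std::size_t begin, std::size_t end, std::size_t grainsize,
                 std::vector<std::size_t>& out)
{
  std::size_t size = end - begin;
  if (size <= grainsize) {         // no longer divisible: this is one chunk
    out.push_back(size);
    return;
  }
  std::size_t middle = begin + size / 2;   // split at the midpoint
  split_sizes(begin, middle, grainsize, out);
  split_sizes(middle, end, grainsize, out);
}

// Returns the chunk sizes in left-to-right order for the range [0, n).
std::vector<std::size_t> simulate_simple_partitioner(std::size_t n,
                                                     std::size_t grainsize)
{
  std::vector<std::size_t> sizes;
  if (grainsize == 0) grainsize = 1;       // guard against endless recursion
  if (n > 0) split_sizes(0, n, grainsize, sizes);
  return sizes;
}
```

Calling simulate_simple_partitioner(14, 3) reproduces the chunk sizes from the question. In a real run the order in which the chunks are executed depends on the scheduler, so only the set of sizes should be expected to match.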

There is no easy way to get a fixed chunk size with TBB (although you can easily achieve this with OpenMP), because it goes against the concepts of TBB. TBB tries to abstract these details away from you, and its scheduler makes sure that your threads are used as well as possible at run time.
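One possible workaround, which is my own suggestion rather than a TBB feature, is to parallelize over interval indices instead of over elements and to compute each interval's element range by hand. The sketch below shows only the index arithmetic. The names interval_count and interval_bounds are hypothetical, and interval_size is assumed to be at least 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Number of intervals needed to cover n elements; the last one may be partial.
// Assumes interval_size >= 1.
std::size_t interval_count(std::size_t n, std::size_t interval_size)
{
  return (n + interval_size - 1) / interval_size;
}

// Half-open element range [first, second) covered by interval k,
// where 0 <= k < interval_count(n, interval_size).
std::pair<std::size_t, std::size_t>
interval_bounds(std::size_t k, std::size_t n, std::size_t interval_size)
{
  std::size_t begin = k * interval_size;
  std::size_t end = std::min(begin + interval_size, n);  // clamp last interval
  return {begin, end};
}
```

With the numbers from the question (n = 14, interval_size = 3) this gives five intervals, four of size 3 and a final one of size 2. The index form tbb::parallel_for(size_t(0), interval_count(n, interval_size), f) calls f(k) once per interval index, so f can look up interval_bounds(k, n, interval_size) and always see a full interval except for the last one.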

Stephan Dollberg