
I have a "main" function that performs many small, independent tasks each once per time step. However, after each time step, I must wait for all of the tasks to complete before stepping forward.

I want to make the program multithreaded. I have tried implementations with the boost-offshoot threadpool, and I've tried using a vector of (shared pointers to) threads, and I've tried the asio threadpool ideas (using an io_service, establishing some work, then distributing run to the threads and posting handlers to the io_service).

All of these seem to incur a lot of overhead creating and destroying threads for my "many small tasks." I want a way, preferably using the asio tools, to instantiate one io_service and one thread_group, post handlers to the io_service, and wait for a single time step's work to finish before posting more tasks. Is there a good way to do this? Here's (stripped down) code for what I have working now:

boost::asio::io_service io_service;
for(int theTime = 0; theTime != totalTime; ++theTime)
{
    io_service.reset();
    boost::thread_group threads;
    // scoping to destroy the work object after work is finished being assigned
    {
        boost::asio::io_service::work work(io_service);
        for (int i = 0; i < maxNumThreads; ++i)
        {
            threads.create_thread(boost::bind(&boost::asio::io_service::run,
                &io_service));
        }

        for(int i = 0; i < numSmallTasks; ++i)
        {
            io_service.post(boost::bind(&process_data, i, theTime));
        }
    }
    threads.join_all(); 
}

Here's what I would rather have (but don't know how to implement):

boost::asio::io_service io_service;
boost::thread_group threads;
boost::asio::io_service::work work(io_service);
for (int i = 0; i < maxNumThreads; ++i)
{
    threads.create_thread(boost::bind(&boost::asio::io_service::run,
         &io_service));
}

for(int theTime = 0; theTime != totalTime; ++theTime)
{
    for(int i = 0; i < numSmallTasks; ++i)
    {
        io_service.post(boost::bind(&process_data, i, theTime));
    }
    // wait here until all of these tasks are finished before looping 
    // **** how do I do this? *****
}
// destroy work later and join all threads later...
Sam Miller
John Doe
  • This is not as simple as calling io_service.stop() inside the "time" for loop, after all the tasks have been posted, is it? The docs don't seem to indicate that all posted handlers will be executed before stopping... – John Doe Oct 30 '12 at 20:58

3 Answers


You may use futures for data processing and synchronize with them using boost::wait_for_all(). This will allow you to operate in terms of parts of work done, not threads.

// Signature matches the bind call below: process_data(i, theTime)
int process_data(int i, int theTime) {...}

// Pending futures
std::vector<boost::unique_future<int>> pending_data;

for(int i = 0; i < numSmallTasks; ++i)
{
   // Create task and corresponding future
   // Using shared ptr and binding operator() trick because
   // packaged_task is non-copyable, but asio::io_service::post requires argument to be copyable

   // Boost 1.51 syntax
   // For Boost 1.53+ or C++11 std::packaged_task shall be boost::packaged_task<int()>
   typedef boost::packaged_task<int> task_t;

   boost::shared_ptr<task_t> task = boost::make_shared<task_t>(
      boost::bind(&process_data, i, theTime));

   boost::unique_future<int> fut = task->get_future();

   pending_data.push_back(std::move(fut));
   io_service.post(boost::bind(&task_t::operator(), task));    
}

// After loop - wait until all futures are evaluated
boost::wait_for_all(pending_data.begin(), pending_data.end()); 
Rost
  • Ok, this is sounding promising. I need to look into futures. Thank you for pointing me in this new direction. – John Doe Oct 30 '12 at 21:29
  • Rost, I'm having trouble implementing this without std::move (I'm stuck on a C++03 compiler). How do I get around using std::move? – John Doe Oct 31 '12 at 00:45
  • @JohnDoe Try to use `boost::move` instead – Rost Oct 31 '12 at 07:01
  • Instead of `std::unique_ptr` use `std::auto_ptr` in C++03, its copy constructor is actually a move constructor in C++03 – BigBoss Oct 31 '12 at 09:32
  • Ok, I've gotten halfway there... I used a shared_future instead for the std::vector (and the copy constructor for shared_future from task.get_future()). But I can't for the life of me figure out how to post a packaged_task to the io_service (time for a new question). This seems like the right way to go, so I'll mark yours as the best answer. Thanks for all the help! – John Doe Oct 31 '12 at 11:28
  • New question is here: [link](http://stackoverflow.com/questions/13157502/how-do-you-post-a-boost-packaged-task-to-an-io-service-in-c03). Thanks @Rost for getting me on this path. – John Doe Oct 31 '12 at 12:04
  • @Rost Maybe I am missing something here but I am currently facing problems which made me stumble upon this. Did you try to compile your code? `std/boost::packaged_task`s are uncopyable and unassignable and as `io_service::post` takes the completion handler by value this should actually not compile. – Stephan Dollberg Mar 10 '13 at 22:24
  • @bamboon You are right, this code is not correct, it's just to illustrate idea. Workaround here: http://stackoverflow.com/a/13158515/1599260 – Rost Mar 11 '13 at 09:55
  • @Rost Ah thanks, that's a nice trick. I think in your typedef it should be `boost::packaged_task<int()>`. – Stephan Dollberg Mar 11 '13 at 10:46
  • @bamboon It shall be `int` because the actual future type is `int` here. I'm stuck with Boost 1.51, where the variadic template arguments version of `packaged_task` is not implemented yet. But of course, for Boost 1.53+ or a well-implemented `std::packaged_task` it shall be `int()`. – Rost Mar 11 '13 at 11:06
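As a rough, self-contained illustration of the futures idea discussed in this answer, here is a sketch that uses only the C++11 standard library instead of Boost.Asio. Everything named here is an assumption for illustration: `SimplePool` is a hypothetical stand-in for the io_service plus thread_group, `run_time_steps` is an invented helper, and the body of `process_data` is made up. The part that mirrors the answer is the `shared_ptr` to a `packaged_task`, which keeps the posted handler copyable, and the wait on every future before the next time step.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for io_service + thread_group: a fixed set of
// worker threads that drain a shared queue of copyable handlers.
class SimplePool {
public:
    explicit SimplePool(std::size_t numThreads) {
        for (std::size_t i = 0; i < numThreads; ++i)
            workers_.emplace_back([this] { workerLoop(); });
    }
    // Workers finish whatever is still queued, then exit and are joined.
    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_all();
        for (std::thread& t : workers_) t.join();
    }
    void post(std::function<void()> handler) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(handler));
        }
        cv_.notify_one();
    }

private:
    void workerLoop() {
        for (;;) {
            std::function<void()> handler;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
                if (queue_.empty()) return;  // shutting down, nothing left
                handler = std::move(queue_.front());
                queue_.pop();
            }
            handler();  // run the task outside the lock
        }
    }
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> queue_;
    std::vector<std::thread> workers_;
    bool done_ = false;
};

// Hypothetical task body; the original leaves process_data unspecified.
int process_data(int i, int theTime) { return i + theTime; }

// Posts numSmallTasks tasks per time step and blocks until all of that
// step's futures are ready. Returns the sum of all task results.
long run_time_steps(SimplePool& pool, int totalTime, int numSmallTasks) {
    long total = 0;
    for (int theTime = 0; theTime != totalTime; ++theTime) {
        std::vector<std::future<int>> pending;
        for (int i = 0; i < numSmallTasks; ++i) {
            // packaged_task is move-only, so share it through a shared_ptr
            auto task = std::make_shared<std::packaged_task<int()>>(
                [i, theTime] { return process_data(i, theTime); });
            pending.push_back(task->get_future());
            pool.post([task] { (*task)(); });
        }
        // get() blocks until the task has run: the end-of-step wait
        for (std::future<int>& f : pending) total += f.get();
    }
    return total;
}
```

A caller would construct one `SimplePool` up front, for example `SimplePool pool(4);`, and reuse it across all time steps via `run_time_steps(pool, totalTime, numSmallTasks)`, so threads are created and joined only once.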

Maybe you can use boost::barrier as follows:

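// Each worker thread runs one handler, then waits at the barrier for the others
void thread_proc( boost::asio::io_service& io_service, boost::barrier& b ) {
    while( true ) {
        if( !io_service.run_one() ) break; // io_service stopped
        b.wait();
    }
}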
BigBoss
  • Interesting. I might be able to use a barrier if I could have some sort of condition for each thread that checked whether there were any more handlers posted to each thread's queue. Is there such a thing? – John Doe Oct 30 '12 at 19:58
  • You may have : `while( true ) { if( i_should_do_action() && !ioservice.run_one() ) break; b.wait(); }` – BigBoss Oct 30 '12 at 20:49
  • I'm not sure I understand. With the "work" object instantiated, the io_service won't ever be in a stopped state, will it? That is, will the call to io_service.run_one() ever return 0 in the setup above? – John Doe Oct 30 '12 at 21:06
  • This is bad solution. It's not correct, not scalable, defeats the idea of multithreading. What if thread count is much less than parts of work to be done? I suppose it is. Each thread will execute one part of work and block until all other threads done. But why??? It must process another piece of work! – Rost Oct 30 '12 at 21:22
  • Also consider you have N threads and M parts of work. N < M. With your barrier you will stop waiting after N parts of work are done, but the rest M-N parts will not be done at the moment barrier is released! – Rost Oct 30 '12 at 21:25
  • @Rost I assume you want a solution that solve all problems of the world!? I have a solution for the question, in question we have N threads and N jobs and indicated that we want to do some jobs, wait for them to finish then start a new set of jobs! If it is strange to you, ask JohnDoe not me!! – BigBoss Oct 30 '12 at 22:52
  • No need for the flame war. I did envision there being a small number of threads (under 20) and many small tasks (10,000 or so), so it's likely that I'll need to use Rost's solution instead. However, BigBoss, I still don't know what to do with the fact that io_service.run_one() will probably never return 0 in my case. Or will it? – John Doe Oct 30 '12 at 23:40
  • I don't want to start war either! this is stackoverflow and we are here to learn, I accept that @Rost version is much better for different number of threads and tasks, but this was the first thing that reach my mind in a second, a replacement for create/terminate threads. that's all! negative point is for answer that say something wrong according to the question, not something that can be better. never mind, if I were you I will select Rost solution, since it is better and we are here for best :) – BigBoss Oct 31 '12 at 00:28
  • @JohnDoe `run_one` will return 0 after you destroy the `work` object – Rost Oct 31 '12 at 07:22
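The objection raised in the comments above (N threads, M tasks, N < M) can be sidestepped if each thread works through its whole share of a time step before reaching the barrier. The following is only a sketch under assumptions, not BigBoss's code: it uses the C++11 standard library rather than Boost, `Barrier` is a hand-rolled class similar in spirit to `boost::barrier`, `run_with_barrier` and `StepResult` are invented names, and incrementing a counter stands in for `process_data`. Tasks are split statically (thread `id` takes tasks `id`, `id + N`, `id + 2N`, ...), which assumes the tasks cost roughly the same.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Minimal reusable barrier, similar in spirit to boost::barrier.
class Barrier {
public:
    explicit Barrier(std::size_t count)
        : threshold_(count), waiting_(0), generation_(0) {}

    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const std::size_t gen = generation_;
        if (++waiting_ == threshold_) {
            // Last thread to arrive releases everyone and resets the barrier
            waiting_ = 0;
            ++generation_;
            cv_.notify_all();
        } else {
            cv_.wait(lock, [&] { return gen != generation_; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const std::size_t threshold_;
    std::size_t waiting_;
    std::size_t generation_;
};

struct StepResult {
    long completed;   // total number of tasks executed
    int violations;   // times a thread passed the barrier too early
};

// Threads are created once and live across all time steps.
StepResult run_with_barrier(int numThreads, int totalTime, int numSmallTasks) {
    std::atomic<long> completed(0);
    std::atomic<int> violations(0);
    Barrier barrier(static_cast<std::size_t>(numThreads));
    std::vector<std::thread> threads;

    for (int id = 0; id < numThreads; ++id) {
        threads.emplace_back([&, id] {
            for (int theTime = 0; theTime != totalTime; ++theTime) {
                // This thread's share of the step: tasks id, id + N, ...
                for (int i = id; i < numSmallTasks; i += numThreads) {
                    ++completed;  // stand-in for process_data(i, theTime)
                }
                barrier.wait();  // nobody proceeds until the step is done
                // Past the barrier, every task of this step must be finished
                if (completed.load() <
                    static_cast<long>(theTime + 1) * numSmallTasks) {
                    ++violations;
                }
            }
        });
    }
    for (std::thread& t : threads) t.join();
    return StepResult{completed.load(), violations.load()};
}
```

Static partitioning avoids a shared queue entirely, but it balances poorly when task costs vary a lot; in that case a queue-based approach with futures, as in the other answer, is likely the better fit.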

Rost's method essentially works, but the boost::make_shared call does not compile as is. The following is a working version (in VS2012):

#include <vector>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/make_shared.hpp>
#include <boost/function_types/result_type.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/function.hpp>
#include <boost/thread.hpp>

std::vector<boost::unique_future<void>> pending_data;
typedef boost::packaged_task<void> task_t;

// For each task: hold the packaged_task in a shared_ptr so the posted handler is copyable
boost::shared_ptr<task_t> pt(new task_t([&,i](){...}));
boost::unique_future<void> result = pt->get_future();
pending_data.push_back(boost::move(result));
io_service.post(boost::bind(&task_t::operator(), pt));

// After posting all tasks: wait for every future, then clear for the next time step
boost::wait_for_all(pending_data.begin(), pending_data.end());
pending_data.clear();

It will not compile if you use an argument in the packaged_task typedef. This approach (an asio thread pool plus futures) saved only 8% of the time compared with creating new threads on every loop iteration.

Frank