
I am trying to build a generic task system where I can post tasks that get executed on whatever thread is free. With my previous attempt I often ran out of threads because they would block at some point. So I am trying Boost.Fiber: when one fiber blocks, the thread is free to work on some other fiber, which sounds perfect.

The work-stealing algorithm seems ideal for my purpose, but I have a very hard time using it. In the example code, the fibers get created first and only then the threads and schedulers, so the fibers actually get executed on all the threads. But I want to start fibers later, and by then all the other threads are suspended indefinitely because they didn't have any work. I have not found any way to wake them up again; all my fibers get executed only on the main thread. `notify()` seems to be the method to call, but I don't see any way to actually get to an instance of an algorithm.

I tried keeping pointers to all instances of the algorithm so I could call notify(), but that doesn't really help; most of the time the algorithms in the worker threads cannot steal anything from the main one because the next one is the dispatcher_context.

I could disable "suspend", but then the threads are busy-waiting, which is not an option.

I also tried the shared_work algorithm. Same problem: once a thread cannot find a fiber, it never wakes up again. I tried the same hack of manually calling notify(), with the same result: very unreliable.

I tried using channels, but AFAICT, if a fiber is waiting on a channel, the current context just "hops" over and runs the waiting fiber, suspending the current one.

In short: I find it very hard to reliably run a fiber on another thread. When profiling, most threads are just waiting on a condition_variable, even though I created tons of fibers.

As a small testing case I am trying:

std::vector<boost::fibers::future<int>> v;

for (auto i = 0; i < 16; ++i)
    v.emplace_back(boost::fibers::async([i] {
       std::this_thread::sleep_for(std::chrono::milliseconds(1000));
       return i;
    }));

int s = 0;
for (auto &f : v)
    s += f.get();

I am intentionally using this_thread::sleep_for to simulate the CPU being busy.

With 16 threads I would expect this code to run in 1s, but mostly it ends up taking 16s. I was able to get this specific example to actually run in 1s by hacking around, but none of the approaches felt "right" and none worked for other scenarios; each had to be hand-crafted for one specific scenario.

I think this example should just work as expected with the work_stealing algorithm; what am I missing? Is it just a misuse of fibers? How could I implement this reliably?

Thanks, Dix

Dix
  • Note that I've never used `boost::fibers`. However, if it's anything like `std::async` and `std::future`, then it seems to me you are only starting the fibers when `get` is called. Since `get` is blocking, the first iteration through the `for`-loop will take 1 sec. Then `get` is called on the next element, taking another 1 sec, etc. What if you do `f.wait_for(std::chrono::seconds(0))` on each element of `v` first? – AVH Dec 12 '17 at 16:03
  • I think technically it is up to the runtime whether a std::future gets started on a different thread immediately or only when get() is called. I'll try and see if wait_for could solve the problem for futures; but futures here are just the shortest example I could come up with, most of the time I don't work with them. – Dix Dec 14 '17 at 11:56
  • How many OS threads were backing these fibers? – Brandon Kohn May 06 '18 at 12:05

1 Answer


Boost.Fiber contains an example using the work_stealing algorithm (examples/work_stealing.cpp).

  1. You have to install the algorithm on each worker thread that should handle/steal fibers, e.g. for 4 worker threads: `boost::fibers::use_scheduling_algorithm< boost::fibers::algo::work_stealing >( 4);`

  2. Before you process tasks/fibers, you have to wait until all worker threads have been registered with the algorithm. The example uses a barrier for this purpose.

  3. You need an indication that all work/tasks have been processed, for instance using a condition variable.

Take a look at Running with worker threads (boost documentation).

xlrg
  • I did read the documentation and the example is what I used as a base; the problem is when I start new fibers but the worker threads have already done all the initial work and already got suspended indefinitely. The link is to a page that isn't in the current boost version; "Processing tasks" might be a way forward, I'll try extending that to multiple threads – Dix Dec 14 '17 at 11:51
  • Worker threads should spin, i.e. they should not be blocked inside work_stealing::suspend_until(). Maybe you should try boost-1.66 (it includes the documentation mentioned above). – xlrg Dec 14 '17 at 12:35
  • Spinning the threads would work (there's actually a parameter for it), but that means using 100% CPU even when there are no fibers to run; that's not an option for me. – Dix Dec 14 '17 at 20:30
  • Then you have to call work_stealing::notify() for a worker thread in order to wake it up when new work is available. – xlrg Dec 15 '17 at 10:04
  • That's what I hacked in (I don't see any way to do that without supplying your own algorithm); but most of the time (this seems to be purely random) the context it tries to steal from main is a (the?) dispatcher_context, and that cannot be stolen. I've had some success by starting a new fiber just to call notify(); but that's just a random "fix", not reliable. – Dix Dec 15 '17 at 11:53