I'm trying to implement a multithreaded application based on a slightly altered boss/worker model. Basically the main thread creates several boss threads, which in turn spawn two worker threads each (possibly more). That's because the boss threads deal with one host or network device each, and the worker threads could take a while to complete their work.
I'm using Thread::Pool to realize this concept, and so far it works quite well; I also don't think my problem is related to Thread::Pool (see below). Very simplified pseudocode ahead:
use strict;
use warnings;

my $bosspool = create_bosspool();    # spawns all boss threads
my $taskpool = undef;                # created in each boss thread at
                                     # creation of each boss thread

# give device jobs to boss threads
while (1) {
    foreach my $device ( @devices ) {
        $bosspool->job($device);
    }
    sleep(1);
}

# This sub is called for jobs passed to the $bosspool
sub process_boss
{
    my $device = shift;

    # $device->{tasks} is an array reference, so dereference it to loop over the tasks
    foreach my $task ( @{ $device->{tasks} } ) {
        # process results as they become available
        process_result() while ( $taskpool->results );

        # give task jobs to task threads
        scalar $taskpool->job($device, $task);
        sleep(1); ### HACK ###
    }

    # process remaining results / wait for all tasks to finish
    process_result() while ( $taskpool->results || $taskpool->todo );

    # happy result processing
}

sub process_result
{
    my $result = $taskpool->result_any();
    # mangle $result
}

# This sub is called for jobs passed to the $taskpool of each boss thread
sub process_task
{
    # not so important stuff
    return $result;
}

By the way, the reason I'm not using the monitor() routine is that I have to wait for all jobs in the $taskpool to finish. Now, this code works just fine, unless you remove the ### HACK ### line. Without sleeping, $taskpool->todo() won't report the right number of jobs still open if you add them or receive their results too "fast". For example, you add 4 jobs in total, but $taskpool->todo() returns only 2 afterwards (with no pending results). This leads to all sorts of interesting effects.
OK, so Thread::Pool->todo() is crap, let's try a workaround:

sub process_boss
{
    my $device = shift;
    my $todo   = 0;    # number of jobs submitted but not yet collected

    # $device->{tasks} is an array reference, so dereference it to loop over the tasks
    foreach my $task ( @{ $device->{tasks} } ) {
        # process results as they become available
        while ( $taskpool->results ) {
            process_result();
            $todo--;
        }

        # give task jobs to task threads
        scalar $taskpool->job($device, $task);
        $todo++;
    }

    # process remaining results / wait for all tasks to finish
    while ( $todo ) {
        process_result();
        sleep(1); ### HACK ###
        $todo--;
    }
}

This will also work fine, as long as I keep the ### HACK ### line. Without this line, the code reproduces the problems of Thread::Pool->todo(), as $todo gets decremented not just by 1, but by 2 or even more.
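One way to narrow this down might be to run the same counting logic without Thread::Pool at all. The following is only a minimal sketch under stated assumptions: it swaps in the core threads and Thread::Queue modules for the task pool, and the names $tasks, $results, run_tasks and the task labels are hypothetical stand-ins, not part of the real application. It submits jobs, collects finished results without blocking while submitting, and then blocks until the private $todo counter reaches zero, with no sleep involved.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-ins for the task pool: one queue of pending tasks,
# one queue of finished results (core modules only, no Thread::Pool).
my $tasks   = Thread::Queue->new;
my $results = Thread::Queue->new;

# Two worker threads, mirroring the two task threads per boss.
my @workers = map {
    threads->create(sub {
        # an undef entry is used as the "no more work" marker
        while ( defined( my $task = $tasks->dequeue ) ) {
            $results->enqueue("done:$task");    # placeholder for real work
        }
    });
} 1 .. 2;

# Submit all tasks and return their results; $todo is private to this sub.
sub run_tasks {
    my @task_list = @_;
    my $todo      = 0;
    my @done;

    foreach my $task (@task_list) {
        # collect whatever has already finished, without blocking
        while ( defined( my $result = $results->dequeue_nb ) ) {
            push @done, $result;
            $todo--;
        }
        $tasks->enqueue($task);
        $todo++;
    }

    # block until every outstanding task has reported back
    while ( $todo > 0 ) {
        push @done, $results->dequeue;
        $todo--;
    }
    return @done;
}

my @done = run_tasks(qw(ping snmp ssh http));
print scalar(@done), " results\n";

# one undef per worker lets each of them leave its loop
$tasks->enqueue(undef) for @workers;
$_->join for @workers;
```

Here each result is fetched by exactly one dequeue call, so the counter and the number of collected results cannot drift apart. If this behaves while the Thread::Pool version does not, that would at least point towards how results() and result_any() interact rather than towards the counter itself.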
I've tested this code with only one boss thread, so there was basically no multithreading involved (when it comes to this subroutine). $bosspool, $taskpool and especially $todo aren't :shared, so no side effects should be possible, right? What's happening in this subroutine, which gets executed by only one boss thread, with no shared variables, semaphores, etc.?