I'm trying to parallelize the processing of a file set using bash. I'm using named pipes for keeping number of process fixed and to gather output from the processes.
I'm assuming that the writes to named pipe are atomic, i.e the output of different process is not mixed up. Is that a safe assumption?
Any advice is greatly appreciated. I'm limited to using bash.
Here's the code:
mytask()
{
wItem=$1
#dummy func; process workItem
rt=$RANDOM
st=$rt;
let "rt %= 2"
let "st %= 10"
sleep $st
return $rt
}
parallelizeTask()
{
workList=$1
threadCnt=$2
task=$3
threadSyncPipeD=$4
outputSyncPipeD=$5
ti=0
for workItem in $workList; do
if [ $ti -lt $threadCnt ]; then
{ $task $workItem; if [ $? == 0 ]; then result="success"; else result="failure"; fi; \
echo "$result:$workItem" >&$outputSyncPipeD; echo "$result:$workItem" >&$threadSyncPipeD; } &
((ti++))
continue;
fi
while read n; do
((ti--))
break;
done <&$threadSyncPipeD
{ $task $workItem; if [ $? == 0 ]; then result="success"; else result="failure"; fi; \
echo "$result:$workItem" >&$outputSyncPipeD; echo "$result:$workItem" >&$threadSyncPipeD;} &
((i++))
done
wait
echo "quit" >&$outputSyncPipeD
while read n; do
if [[ $n == "quit" ]]; then
break;
else
eval $6="\${$6}\ \$n"
fi
done <&$outputSyncPipeD;
}
main()
{
if [ ! -p threadSyncPipe ]; then
mkfifo threadSyncPipe
fi
if [ ! -p outputSyncPipe ]; then
mkfifo outputSyncPipe
fi
exec 4<>threadSyncPipe
exec 3<>outputSyncPipe
gout=
parallelizeTask "f1 f2 f3 f4 f5 f6" 2 mytask 3 4 gout
echo "finalOutput: $gout";
for f in $gout; do
echo $f
done
rm outputSyncPipe
rm threadSyncPipe
}
main
I found below related post with answer to my question. I have revised the title to make it more appropriate.
Are there repercussions to having many processes write to a single reader on a named pipe in posix?