Matlab parfor work distribution

Question

I have a parfor loop over, say, 100 iterations. The workload differs per iteration and changes linearly: the first iteration takes the most time and the last one is the fastest. When I run the loop with my four instances/labs, only one lab is active during the last few hours, because it is working through the first few iterations on its own.

So I know which iterations are the slow ones. How can I distribute the workload more evenly between cores? For example, could I somehow force all labs to start on the first four slow iterations and then proceed in order? Or is there something similar that prevents a single active core from running the few slow iterations alone?

I think that the answers to http://stackoverflow.com/questions/9937200/matlab-parallel-computing-toolbox-dynamic-allocation-of-work-in-parfor-loops/9939455#9939455 may be of interest to you. — High Performance Mark, Jul 04 '12 at 08:32
I think [@Edric](http://stackoverflow.com/a/9938666/97160) gave a hint in the question linked: `roughly speaking, PARFOR executes loop iterations in reverse order, so you could put them at the "end" of the loop so work starts on them immediately` — Amro, Jul 04 '12 at 08:50

denahiro · Answer 1 · 2012-07-04T11:07:24.267

2

Matlab's parfor does nothing more than split up the indices and distribute them to the workers. It does this by creating contiguous chunks from the indices. I don't know the exact algorithm, but this means that iterations with similar indices tend to be computed in the same chunk and by the same worker.
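
To see why contiguous chunks hurt with a linearly decreasing workload, here is a rough sketch of the arithmetic. It assumes an idealised static split into four equal chunks and a made-up cost model (cost(i) = N - i + 1); the real parfor scheduler hands out work dynamically and does not split exactly like this, so treat the numbers as an illustration only.

```matlab
% Idealised model (assumption): N iterations, iteration 1 is the slowest.
N = 100;
nWorkers = 4;
cost = N:-1:1;                      % hypothetical cost of iteration i

% Contiguous split: worker w gets iterations (w-1)*25+1 : w*25.
% reshape fills column-wise, so each column is one contiguous chunk.
chunkLoad = sum(reshape(cost, N/nWorkers, nWorkers), 1);

% Interleaved split: worker w gets iterations w, w+4, w+8, ...
% Here each row of the reshaped matrix holds one worker's iterations.
stridedLoad = sum(reshape(cost, nWorkers, N/nWorkers), 2)';

disp(chunkLoad);                    % load per worker, contiguous chunks
disp(stridedLoad);                  % load per worker, interleaved
```

In this model the most loaded worker in the contiguous split carries 2200 cost units against 325 for the least loaded one, while the interleaved split keeps all four workers between 1225 and 1300.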

The simplest solution is a stochastic one: shuffle your indices so that the work-intensive steps are spread across the chunks. While this doesn't give you any guarantees on performance, it is simple and will work most of the time.

Some example code:

% dummy data
N=10;
data=1:N;

% generate the permuted indices
permIndex=randperm(N);

% permute the data
dataPermuted=data(permIndex);

% run the loop
parfor i=1:N
    % do something e.g. pause for the time as specified by data
    pause(dataPermuted(i));
end

% invert the index permutation
dataInversePermuted(permIndex)=dataPermuted;

I used pause to simulate the different computation times.
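
If the loop produces results, the same inverse permutation can put them back into the original order. The following is a minimal sketch of that idea; the squaring is only a hypothetical stand-in for the real computation, and resultPermuted and result are names introduced here for illustration.

```matlab
N = 10;
data = 1:N;                          % dummy data, as above

permIndex = randperm(N);             % shuffled iteration order
dataPermuted = data(permIndex);

resultPermuted = zeros(1, N);        % results in shuffled order
parfor k = 1:N
    % hypothetical stand-in for the real, expensive computation
    resultPermuted(k) = dataPermuted(k)^2;
end

% undo the shuffle: result(i) now belongs to data(i)
result = zeros(1, N);
result(permIndex) = resultPermuted;
```

Without the Parallel Computing Toolbox the parfor loop simply runs serially, so the reordering logic can be checked on any installation.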

edited Jul 04 '12 at 11:07

answered Jul 04 '12 at 10:11


"So worker1 would crunch everything with i1=1:100 and worker2 i2=101:200." - Where do you take this information from? It seems to contradict the answers in [the previously linked question](http://stackoverflow.com/questions/9937200/matlab-parallel-computing-toolbox-dynamic-allocation-of-work-in-parfor-loops/9939455#9939455). – arne.b Jul 04 '12 at 10:55
@arne.b it's true matlab doesn't really split it up like this. My point is that the range of indices gets split up into contiguous chunks. If all time intensive steps are close to each other there is a huge chance that they will be in the same chunk. I'll update the answer. – denahiro Jul 04 '12 at 11:03

score 1 · Answer 2 · answered Jul 04 '12 at 13:12

I don't think this is documented anywhere, but you can quickly deduce that PARFOR runs iterations in reverse loop order (using pause and disp if you want to see it in action). So, you should simply reverse your loop. PARFOR gives you no means to explicitly control execution order, but SPMD using for-drange does (PARFOR is significantly easier to use though).
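
A minimal sketch of reversing the loop, assuming (as in the question) that iteration 1 is the slowest. Since a parfor range has to be increasing consecutive integers, the loop variable is remapped inside the body instead of counting down. The cost vector and the doubling are hypothetical placeholders, and the reverse execution order is the undocumented behaviour described above, so it may differ between releases.

```matlab
N = 100;
cost = N:-1:1;                   % assumed workload: iteration 1 is slowest

result = zeros(1, N);
parfor k = 1:N
    orig = N - k + 1;            % k = N maps to the slow original iteration 1
    result(k) = 2 * cost(orig);  % hypothetical stand-in for the real work
end

% flip back so that result(i) corresponds to original iteration i
result = fliplr(result);
```

With this mapping the slow iterations sit at the "end" of the loop range, which is where work reportedly starts first.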

@denahiro's suggestion is also a good one.
