Ipyparallel slow execution with scatter/gather

Question

Context: I have an array scattered across my engines (4 at the moment). I want to apply a function to each point in the array for an arbitrary number of iterations, then gather the resulting array from the engines and analyse it.

For example, here is the array of data points, which is scattered to the engines, and the number of iterations to run on each data point:

data_points = range(16)
iterations = 10
dview.scatter('points', data_points)

I have a user-supplied function, which is pushed to the engines:

def user_supplied_function(point):
    # Import inside the function so randint also resolves on the engines.
    from random import randint
    return randint(0, point)

dview.push(dict(function_one = user_supplied_function))

A list for my results and the parallel execution:

result_list = []
for i in range(iterations):
    %px engine_result = [function_one(j) for j in points]
    result_list.append(dview.gather('engine_result'))

Issue: This works, and I get the result I want from the engines. However, as the number of iterations grows, the loop takes longer and longer to execute, to the point where 1000 iterations on 50 points takes upwards of 15 seconds, whereas a sequential version of the same task takes less than a second.

Any idea what could be causing this? Could it be the message-passing overhead from gather()? If so, can anyone suggest a solution?

score 0 · Answer 1 · answered Mar 28 '16 at 02:48

Figured it out. It was the overhead from gather() and .append() after all. The easiest fix is to gather() after the engines have finished their work, as opposed to doing it each iteration.
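
As a rough check on why the per-iteration gather() dominates, the figures from the question can be turned into a cost per iteration. This is only back-of-envelope arithmetic on the numbers quoted there (15 seconds is a lower bound, since the question says "upwards of"), not a measurement; the variable names are just for illustration:

```python
# Figures taken from the question: 1000 iterations took upwards of 15 seconds.
total_seconds = 15.0
iterations = 1000

# Average wall-clock cost of one loop iteration, in milliseconds.
per_iteration_ms = total_seconds / iterations * 1000
print("about %.0f ms per iteration" % per_iteration_ms)
```

The sequential version finishes all 1000 iterations in under a second, i.e. under 1 ms each, so nearly all of the time in the parallel loop goes to talking to the engines rather than to the work itself.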

Solution

%autopx
engine_result = []
for i in xrange(iterations):
    engine_result += [[function_one(j) for j in points]]
%autopx
result_list = list(dview.gather('engine_result'))

This, however, returns the results as an awkwardly ordered list of lists: all the per-iteration sublists from one engine come first, followed by those from the next engine, instead of being ordered by iteration number. The following commands regroup the sublists by iteration and then flatten them into one list per iteration.

# Regroup so that entry i holds every engine's sublist for iteration i.
gathered_list = [[result_list[j * iterations + i] for j in xrange(len(result_list) // iterations)] for i in xrange(iterations)]
# Flatten each iteration's per-engine sublists into a single list.
gathered_list = [reduce(lambda x, y: x.extend(y) or x, z) for z in gathered_list]
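
The same regrouping can be sketched as a small standalone function, which may be easier to read and to test without a running cluster. This is a minimal sketch, not part of ipyparallel: the name regroup_by_iteration is made up for illustration, and the simulated input assumes gather() returns the engines' lists concatenated in engine order, as described above.

```python
from itertools import chain


def regroup_by_iteration(result_list, iterations):
    """Turn engine-major gather() output into one flat list per iteration."""
    # Each engine contributed exactly `iterations` sublists.
    n_engines = len(result_list) // iterations
    return [
        # Concatenate every engine's sublist for iteration i, in engine order.
        list(chain.from_iterable(
            result_list[e * iterations + i] for e in range(n_engines)
        ))
        for i in range(iterations)
    ]


# Simulated gather() output for 2 engines and 3 iterations (hypothetical data):
# engine 0's three per-iteration sublists come first, then engine 1's.
simulated = [
    ["e0-i0-a", "e0-i0-b"], ["e0-i1-a", "e0-i1-b"], ["e0-i2-a", "e0-i2-b"],
    ["e1-i0-a", "e1-i0-b"], ["e1-i1-a", "e1-i1-b"], ["e1-i2-a", "e1-i2-b"],
]
regrouped = regroup_by_iteration(simulated, 3)
```

Unlike the reduce() version, this builds new lists instead of extending the first sublist in place, so result_list is left unchanged.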
