I have used parallel computing before through MPI (and Fortran :)). I would now like to use the parallel capabilities of IPython.
My question is related to the poor performance of the following code, inspired by http://ipython.org/ipython-doc/dev/parallel/asyncresult.html:
from IPython.parallel import Client
import numpy as np
_procs = Client()
print 'engines #', len(_procs)
dv = _procs.direct_view()
X = np.linspace(0,100)
add = lambda a,b: a+b
sq = lambda x: x*x
%timeit reduce(add, map(sq, X))
%timeit reduce(add, dv.map(sq, X))
The results for one processor are (built-in map first, then dv.map):
10000 loops, best of 3: 43 µs per loop
100 loops, best of 3: 4.77 ms per loop
Could you tell me if the results seem normal to you and, if so, why there is such a huge difference in computational time?
Best regards, Flavien.
length # 10000
100 loops, best of 3: 8.68 ms per loop
10 loops, best of 3: 61.5 ms per loop
length # 100000
10 loops, best of 3: 89.3 ms per loop
1 loops, best of 3: 558 ms per loop
length # 1000000
1 loops, best of 3: 911 ms per loop
1 loops, best of 3: 5.59 s per loop
– Flavien Lambert Oct 07 '14 at 02:55
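For anyone who wants to repeat this kind of length sweep without a running IPython cluster, here is a minimal sketch. It is not the code above. As assumptions, it swaps dv.map for the standard library's ThreadPoolExecutor.map and builds the input with a plain list instead of np.linspace. A thread pool involves no serialization or network traffic, so it will not reproduce the numbers above; it only shows how the serial and dispatched versions can be timed side by side. The names serial_sum_sq, pooled_sum_sq and compare are made up for this illustration.

```python
import timeit
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def add(a, b):
    return a + b


def sq(x):
    return x * x


def serial_sum_sq(values):
    # Same shape as reduce(add, map(sq, X)) in the question.
    return reduce(add, map(sq, values))


def pooled_sum_sq(pool, values):
    # Stand-in (assumption) for reduce(add, dv.map(sq, X)):
    # each element is dispatched to the pool as its own task.
    return reduce(add, pool.map(sq, values))


def compare(lengths=(50, 1000), workers=1, repeats=3):
    """Return (length, best serial time, best pooled time) in seconds."""
    rows = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for n in lengths:
            # n evenly spaced points on [0, 100]; requires n >= 2.
            values = [100.0 * i / (n - 1) for i in range(n)]
            t_serial = min(timeit.repeat(
                lambda: serial_sum_sq(values), number=1, repeat=repeats))
            t_pooled = min(timeit.repeat(
                lambda: pooled_sum_sq(pool, values), number=1, repeat=repeats))
            rows.append((n, t_serial, t_pooled))
    return rows


if __name__ == "__main__":
    for n, t_serial, t_pooled in compare():
        print("length # %d  serial %.3g s  pooled %.3g s" % (n, t_serial, t_pooled))
```

Calling compare(lengths=(10000, 100000)) sweeps the first two lengths from the comment; the timings depend on the machine, so none are quoted here.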