
I am new to MATLAB and am trying to run a small parallel program, but the parallel version takes more time than the serial one. Why?


close all
clear all
clc

a= rand(1e6,1);
b= rand(1e6,1);
c= zeros(size(a));
d= ones(size(c));
e= zeros(size(d));

tstart=tic;
for i=1:length(a)
    c(i)=a(i)+b(i);
    d(i)=c(i)+b(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    e(i)=d(i)+c(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    c(i)=a(i)+b(i);
    d(i)=c(i)+b(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    e(i)=d(i)+c(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    c(i)=a(i)+b(i);
    d(i)=c(i)+b(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    e(i)=d(i)+c(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
end
t_normal_for=toc(tstart)

tstart=tic;
parfor i=1:length(a)
    c(i)=a(i)+b(i);
    d(i)=c(i)+b(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    e(i)=d(i)+c(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    c(i)=a(i)+b(i);
    d(i)=c(i)+b(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    e(i)=d(i)+c(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    c(i)=a(i)+b(i);
    d(i)=c(i)+b(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    e(i)=d(i)+c(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
end
t_parfor=toc(tstart)

***************************************
t_normal_for =

    0.3860


t_parfor =

    2.8403

Can anyone help? I have 4 workers on my computer and my MATLAB version is R2014a. Another question: can I send the same function to each worker on my computer?

Thank you in advance, Ammar

ammar
  • Where do you open your parallel pool? Opening one takes about 2 seconds, and if that is included in the parfor time I'm not surprised it takes longer. – Adriaan Aug 20 '15 at 08:53
  • I just ran the same code and I get the same results, even when I open my parallel pool before executing the code. It's possible that the arrays a and b first get sent to all workers, thus replicating your data four times, then the operation is done, and then the arrays c, d and e have to be gathered back to your client. The overhead is rather large. – Adriaan Aug 20 '15 at 08:59
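To rule out pool start-up as the cause, as the first comment suggests, the pool can be opened before the timer starts. This is a minimal sketch, assuming the Parallel Computing Toolbox is installed and the default profile is used (`gcp` and `parpool` are available from R2013b onward); the loop body is shortened to one statement for illustration.

```matlab
% Open a pool only if none is running, so start-up time is not measured.
if isempty(gcp('nocreate'))
    parpool;                 % default profile; worker count depends on the machine
end

a = rand(1e6,1);
b = rand(1e6,1);
c = zeros(size(a));

tstart = tic;
parfor i = 1:length(a)
    c(i) = a(i) + b(i);      % c is a sliced output variable
end
t_parfor = toc(tstart);      % excludes pool start-up, still includes data transfer
```

Any time that remains above the serial loop is then due to moving data to and from the workers, not to starting them.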

1 Answer


You have one big problem: your code is... bad (no offense). It's not actually bad code; it's just not the way you write code in MATLAB.

You are writing MATLAB code as if it were C, and MATLAB is not C! In MATLAB, the loop over c(i)=a(i)+b(i) is written c=a+b;. It is faster that way, because MATLAB is optimized for vectorized operations.

Before thinking about optimizing your code, try to write it in "MATLAB style".

Second problem: Understanding parallel computing.

Parallel computing works when you have HUGE problems that would take minutes, or more likely hours, and you want to split them into smaller parts. The way it works is that you send chunks of work to different processors, and sending information to each processor takes A LOT of time compared to computing. Parallel computing is practical because the problems it is generally used for are so computationally expensive that you don't mind spending some extra time sending and receiving chunks of memory.

A 1e6 x 1 array is definitely not a big problem, and thus unsuited for parallel computing, especially if all you are doing with it is 15 mathematical operations!

Most likely your parallel code spends 99% of its time moving a through e between the client and the workers and 1% doing the maths. You are making your code much slower, because you are moving a lot of memory just to do a couple of multiplications!
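For contrast, here is a sketch of the kind of loop where the balance tips the other way: few iterations, each one expensive, with almost no data sent to or from the workers. The workload (`svd` of a random matrix) and the names `nTasks`, `n` and `result` are hypothetical, chosen only for illustration. Whether it actually beats a plain for loop depends on the machine, since built-in functions such as `svd` may already use several threads on the client.

```matlab
% Hypothetical workload: each iteration is costly and needs no large input.
nTasks = 8;                  % assumed number of independent tasks
n      = 400;                % assumed matrix size
result = zeros(nTasks,1);

parfor k = 1:nTasks
    M = rand(n);             % created on the worker, so nothing large is sent
    s = svd(M);              % expensive compared with returning one scalar
    result(k) = s(1);        % only the largest singular value comes back
end
```

Here each worker receives little more than the loop index and returns a single number, so the communication cost is small relative to the computation.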

EDIT: What do I mean when I say your code is bad?

Easy example: write your code both vectorized and the way you wrote it, and compare the times:

% Vectorized version: whole-array operations (note the element-wise .* and ./)
tic
c = a + b;
d = c + b.*a + b./a + b.*a + b;
e = d + c.*a + b./a + b.*a + b;

c = a + b;
d = c + b.*a + b./a + b.*a + b;
e = d + c.*a + b./a + b.*a + b;

c = a + b;
d = c + b.*a + b./a + b.*a + b;
e = d + c.*a + b./a + b.*a + b;
toc

% Loop version: the same operations, one element at a time
tic
for i=1:length(a)
    c(i)=a(i)+b(i);
    d(i)=c(i)+b(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    e(i)=d(i)+c(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    c(i)=a(i)+b(i);
    d(i)=c(i)+b(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    e(i)=d(i)+c(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    c(i)=a(i)+b(i);
    d(i)=c(i)+b(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
    e(i)=d(i)+c(i)*a(i)+b(i)/a(i)+b(i)*a(i)+b(i);
end
toc

Result:

Elapsed time is 0.016057 seconds.
Elapsed time is 0.288870 seconds.

Both versions do the same thing, yet the vectorized code is about 18 times faster!
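That the two versions compute the same values can be checked directly. The following sketch uses smaller arrays and hypothetical variable names (`c_vec`, `e_loop`, `maxDiff`, and so on) so that both sets of results can be kept side by side; the repeated blocks are omitted because they recompute the same values.

```matlab
a = rand(1e4,1);
b = rand(1e4,1);

% Vectorized version
c_vec = a + b;
d_vec = c_vec + b.*a + b./a + b.*a + b;
e_vec = d_vec + c_vec.*a + b./a + b.*a + b;

% Loop version, preallocated as in the question
c_loop = zeros(size(a));
d_loop = zeros(size(a));
e_loop = zeros(size(a));
for i = 1:length(a)
    c_loop(i) = a(i) + b(i);
    d_loop(i) = c_loop(i) + b(i)*a(i) + b(i)/a(i) + b(i)*a(i) + b(i);
    e_loop(i) = d_loop(i) + c_loop(i)*a(i) + b(i)/a(i) + b(i)*a(i) + b(i);
end

% Largest absolute difference between the two results
maxDiff = max(abs(e_vec - e_loop));
```

A `maxDiff` at or near zero, relative to the size of the values in `e_vec`, indicates that the two versions agree.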

Ander Biguri