
I am trying to do LU decomposition using MPI.

Below is a snippet of my code:

if (rank == 0)
{
    // Rank 0 sends each row to the processor that owns it
    for (p = 0; p < n; p++)
    {
        if (map[p] != 0)   // rows owned by rank 0 stay where they are
        {
            MPI_Send(&LU[p*n], n, MPI_DOUBLE, map[p], 1, MPI_COMM_WORLD);
            printf("Sending row %d to %d  itr = %d\n", p, map[p], i);
        }
    }
}
else
{
    printf("in else rank = %d\n", rank);

    // Every other rank posts one receive per row it owns
    for (l = 0; l < n; l++)
    {
        if (map[l] == rank)
        {
            printf("in loop itr = %d, rank = %d l = %d  n = %d\n", i, rank, l, n);
            MPI_Recv(&LU[l*n], n, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, &st);
            printf("Recv row %d at %d count = %d itr = %d\n", l, rank, count, i);
        }
    }
}

In this code, rank 0 sends each row to the processor that owns it and will perform the computation on it.

Every other rank receives the rows assigned to it. The loop handles the case where several rows belong to the same processor. Also, map is an array private to each processor that stores which processor owns each row.

However, when I run my program on a 10*10 matrix with 4 processes, execution blocks.

It works fine for the first iteration of i (all of the code above is inside this loop) but not for subsequent iterations.

EDIT:

The above code is part of an LU decomposition. We are trying to achieve the following with the snippet:

Consider 4 processors P0, P1, P2, P3 and a 10*10 matrix. map will contain 0,1,2,3,0,1,2,3,0,1, which records which processor owns which row of the matrix. The send loop sends each row of the matrix to the processor that will work on it, i.e. P0 sends rows 1,5,9 to P1, rows 2,6 to P2 and rows 3,7 to P3. Each processor in turn receives the rows meant for it through the receive in the else part.
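As a sanity check on the mapping described above, the number of sends and receives per rank can be counted without running MPI at all. The following is only a sketch: the helper names (`build_cyclic_map`, `sends_to`, `recvs_at`, `counts_match`) are hypothetical and not part of the original program, and the round-robin rule `map[p] = p % nprocs` is assumed from the example values 0,1,2,3,0,1,2,3,0,1.

```c
/* Hypothetical helper: assign rows round-robin, so row p is owned by rank p % nprocs.
   This rule is an assumption inferred from the example map in the question. */
void build_cyclic_map(int *map, int n, int nprocs)
{
    for (int p = 0; p < n; p++)
        map[p] = p % nprocs;
}

/* Number of MPI_Send calls rank 0 would post to `dest` in the send loop:
   one per row p with map[p] == dest, skipping rows owned by rank 0. */
int sends_to(const int *map, int n, int dest)
{
    int count = 0;
    for (int p = 0; p < n; p++)
        if (map[p] != 0 && map[p] == dest)
            count++;
    return count;
}

/* Number of MPI_Recv calls `rank` would post in the receive loop.
   Rank 0 takes the send branch, so it posts none. */
int recvs_at(const int *map, int n, int rank)
{
    if (rank == 0)
        return 0;
    int count = 0;
    for (int l = 0; l < n; l++)
        if (map[l] == rank)
            count++;
    return count;
}

/* Returns 1 if every rank's receive count equals the number of sends aimed at it. */
int counts_match(const int *map, int n, int nprocs)
{
    for (int r = 0; r < nprocs; r++)
        if (sends_to(map, n, r) != recvs_at(map, n, r))
            return 0;
    return 1;
}
```

Calling `build_cyclic_map(map, 10, 4)` and then `counts_match(map, 10, 4)` covers the 10*10, 4-process case. If the counts match, then this loop on its own pairs every send with a receive, provided every rank holds the same map.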

However, execution blocks when I run this code.

user3351750

2 Answers

0

It looks like your sends and receives aren't matching up.

Remember that the way MPI works is that each time you send a message, there must be a matching receive call on the other end (and vice versa). In your case, you're sending one message from rank 0 to each other process (n sends), and each other process is posting n receives from rank 0. If you count these up, this means you are n-1 sends short on the side of rank 0.

Visually:

0: Send(1)[MATCHED] - Send(2)[MATCHED] - Send(3)[MATCHED]
1: Recv(0)[MATCHED] -     Recv(0)      -     Recv(0)
2: Recv(0)[MATCHED] -     Recv(0)      -     Recv(0)

More than likely, you only need to have one receive call posted by all of the ranks other than 0.

Alternatively, if your model is that you're having rank 0 send a bunch of data to each other rank, a better match for your program would probably be to use MPI_SCATTER. This call will take a large array of data from the root rank (in your case, rank 0), break it up, and send it to all of the other ranks in the communicator. That's probably what you need.
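One caveat: a plain scatter hands each rank one contiguous chunk of the root's buffer. If rows are dealt out cyclically through a `map` array, as in the question, each rank's rows are not contiguous in `LU`. A possible workaround, sketched here under that assumption, is to pack the rows by owner on the root and compute per-rank counts and displacements. The helper name `pack_rows_by_owner` and the `packed`, `counts` and `displs` buffers are hypothetical, not from the original code, and the MPI call itself is only mentioned in a comment.

```c
#include <string.h>

/*
 * Hypothetical helper (not from the question's code).
 * Copies the rows of the n x n row-major matrix LU into `packed`, grouped by
 * owning rank, and fills counts[r] and displs[r] in units of doubles.
 * Rows belonging to the same rank keep their original relative order.
 *
 * On the root, the result could then be handed to something like
 *   MPI_Scatterv(packed, counts, displs, MPI_DOUBLE,
 *                recvbuf, counts[rank], MPI_DOUBLE, 0, MPI_COMM_WORLD);
 * where recvbuf is an assumed per-rank buffer of counts[rank] doubles.
 */
void pack_rows_by_owner(const double *LU, const int *map, int n, int nprocs,
                        double *packed, int *counts, int *displs)
{
    int offset = 0;                      /* next free slot in packed[] */
    for (int r = 0; r < nprocs; r++) {
        displs[r] = offset;              /* where rank r's block starts */
        counts[r] = 0;
        for (int p = 0; p < n; p++) {
            if (map[p] == r) {
                memcpy(&packed[offset], &LU[p * n], (size_t)n * sizeof(double));
                offset += n;
                counts[r] += n;
            }
        }
    }
}
```

The packing costs one extra copy of the matrix on the root. Each rank receives its rows back to back and has to remember which global row each local row corresponds to.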

Wesley Bland
  • Thanks for your reply. I actually thought of using MPI_SCATTER, but the data I want to send is not contiguous. For example, with 4 processors and a 16*16 array, I would like processor 1 to receive rows 1,5,9,13 and, similarly, processor 2 to receive rows 2,6,10,14. I could not figure out a way to do this using MPI_SCATTER. Also, did you mean that each process is waiting on n receives? My if condition takes care of that and sends only to a few others, so I think we have an equal number of sends and recvs. – user3351750 Feb 25 '14 at 15:46
  • Ok, if that's the case that your `if` condition filters that issue out, why do you have two `if` statements? It appears that your outer `if` should solve the problem of not having rank 0 listen for a message from itself. After that, it seems that the additional `for` loop and `if` statement are unnecessary (unless there's something else going on in your code that we can't see here). – Wesley Bland Feb 25 '14 at 15:50
  • The `if` condition in the else part guards the recv, so that each processor receives only the rows meant for it. On the send side, we are sending each row to a different processor. – user3351750 Feb 25 '14 at 15:58
  • The receiving processes shouldn't have to do that. The sender should only be sending messages to the processes that are meant to receive them. When rank 0 sends a message to rank 1, there is no danger that rank 2 will receive that message, so there isn't any need for rank 2 to check that the message was intended for it (unless I'm misunderstanding what you're saying). – Wesley Bland Feb 25 '14 at 16:00
  • But we need to keep track of the number of messages received by each processor: if we omit the `if` condition, each one will wait for n messages from `rank 0`, which is not what we want. Am I right in this approach? – user3351750 Feb 25 '14 at 16:08
  • Ah, I think I understand your code better now. I think it would be helpful to see a minimal working example here. Can you produce something without all of the extras so I can see the context? Note, this does not mean copy and paste all of your code (http://www.sscce.org/). – Wesley Bland Feb 25 '14 at 16:11
  • A text example isn't what would be helpful. In order to figure out what's going on, I need to see a full code example that I can compile and test myself. – Wesley Bland Feb 25 '14 at 16:36
0

Thanks for your help.

The blocking was not caused by this loop, but by another recv waiting for a send.

user3351750