Passing submatrix between processes

Question

First, I'm learning the Message Passing Interface (MPI) from https://computing.llnl.gov/tutorials/mpi/

I'm having trouble creating my own MPI datatype.

My program tries to extract each quadrant of a matrix. Take the following 4 x 4 matrix:

A = {    
      1.0, 2.0,  3.0, 4.0,
      5.0, 6.0,  7.0, 8.0,
      9.0, 10.0, 11.0, 12.0,
      13.0, 14.0, 15.0, 16.0
    }

I want to divide it into 4 submatrices so that when the master sends out 3 of them (submatrix 1, 2, 3), each worker receives its respective submatrix.

Submatrix 0 |  Submatrix 1
Submatrix 2 |  Submatrix 3

Right now, my program only receives the first row of each submatrix and prints the second row as zeros.

The following is the current printout. (You can ignore submatrix 0.)

Attached is my program. Any pointers will be greatly appreciated.

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<mpi.h>

//matrix size
#define SIZE 4

double A[SIZE][SIZE] ={
    1.0, 2.0, 3.0, 4.0,
   5.0, 6.0, 7.0, 8.0,
   9.0, 10.0, 11.0, 12.0,
  13.0, 14.0, 15.0, 16.0
};

static double B[SIZE/2][SIZE/2]; 

MPI_Datatype QUAD;
#define QUADRANT(Q,y,x) (Q[y * SIZE/2]+(x * SIZE/2))


void printout(double Y[SIZE/2][SIZE/2]){
    int i,j;
    for(i=0;i< SIZE/2;i++){
        for(j=0; j< SIZE/2; j++){
            printf("%.0f ",Y[i][j]);
        }
        printf("\n");
    }
}


int main(int argc, char** argv){
    int rank, size, i, j;

    MPI_Init(&argc,&argv);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    MPI_Status stat;

    //Define a MPI datatype, Quadrant
    MPI_Type_vector(SIZE/2, SIZE/2, SIZE, MPI_DOUBLE, &QUAD);
    MPI_Type_commit(&QUAD);

    //master process
    if(rank==0){
        MPI_Send(QUADRANT(A,0,1),1,QUAD,1,0, MPI_COMM_WORLD);
        MPI_Send(QUADRANT(A,1,0),1,QUAD,2,0,MPI_COMM_WORLD);
        MPI_Send(QUADRANT(A,1,1),1,QUAD,3,0,MPI_COMM_WORLD);

    }else{
         MPI_Recv(B,1,QUAD,0,0,MPI_COMM_WORLD,&stat);
         printout(B);
         printf("\n");
    }

    MPI_Finalize();
}

There is a similar program at https://computing.llnl.gov/tutorials/mpi/samples/C/mpi_vector.c, but it extracts all the numbers in a column of the matrix.

@MartinZabel can you be more specific? You mean my print out is not correct? So what should I do? — leoflower, Dec 12 '15 at 22:13
@MartinZabel I'm calling `printout(B)`, where B is `static double B[SIZE/2][SIZE/2]`. what do you mean "you actually call `printout( QUADRANT(A,0,1))` " — leoflower, Dec 12 '15 at 22:20
@MartinZabel I don't know what the other way to get the starting address of the quadrant... Can you suggest a macro ? Thank you. — leoflower, Dec 12 '15 at 22:31
@MartinZabel I checked the documentation of MPI_SEND, the first parameter is "initial address of send buffer (choice)". I don't think my Macro is wrong... I think your solution will destroy the purpose of defining a MPI_Datatype Quad... http://www.mpich.org/static/docs/v3.1/www3/MPI_Send.html — leoflower, Dec 12 '15 at 23:21
Sorry, I misunderstood your definition of `QUAD`. I have removed all my incorrect comments and provided a suitable answer. You can remove your replies as well. — Martin Zabel, Dec 13 '15 at 10:52

score 2 · Accepted Answer · answered Dec 13 '15 at 10:46

The bulk of your problem is that what you want to receive isn't a QUAD, but a straight 2x2 submatrix. The sending part of your code is therefore fine, but the receiving part is wrong.

To fix your code, you need to either copy your quadrants into straight 2x2 matrices on the sending side prior to sending, or allocate a 2x4 receive buffer on the receiver side to store the message and copy the relevant part into your 2x2 matrix afterwards.

Here is what the code would look like with the second option, which I chose for illustration purposes, as you seemed to want to play with derived types. (NB: I kept the code style, although it isn't the one I would use myself.)

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<mpi.h>

//matrix size
#define SIZE 4

double A[SIZE][SIZE] ={
    1.0, 2.0, 3.0, 4.0,
    5.0, 6.0, 7.0, 8.0,
    9.0, 10.0, 11.0, 12.0,
    13.0, 14.0, 15.0, 16.0
};

static double B[SIZE/2][SIZE/2]; 
static double tmpB[SIZE/2][SIZE];

MPI_Datatype QUAD;
#define QUADRANT(Q,y,x) (Q[y * SIZE/2]+(x * SIZE/2))

void printout(double Y[SIZE/2][SIZE/2]){
    int i,j;
    for(i=0;i< SIZE/2;i++){
        for(j=0; j< SIZE/2; j++){
            printf("%.0f ",Y[i][j]);
        }
        printf("\n");
    }
}

// Copy the left SIZE/2 columns of the strided receive buffer tmpY into the compact matrix Y
void compress(double Y[SIZE/2][SIZE/2], double tmpY[SIZE/2][SIZE]){
    int i,j;
    for(i=0;i< SIZE/2;i++){
        for(j=0; j< SIZE/2; j++){
            Y[i][j]=tmpY[i][j];
        }
    }
}

int main(int argc, char** argv){
    int rank, size, i, j;

    MPI_Init(&argc,&argv);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    MPI_Status stat;

    //Define a MPI datatype, Quadrant
    MPI_Type_vector(SIZE/2, SIZE/2, SIZE, MPI_DOUBLE, &QUAD);
    MPI_Type_commit(&QUAD);

    //master process
    if(rank==0){
        MPI_Send(QUADRANT(A,0,1),1,QUAD,1,0,MPI_COMM_WORLD);
        MPI_Send(QUADRANT(A,1,0),1,QUAD,2,0,MPI_COMM_WORLD);
        MPI_Send(QUADRANT(A,1,1),1,QUAD,3,0,MPI_COMM_WORLD);

    }else{
        MPI_Recv(tmpB,1,QUAD,0,0,MPI_COMM_WORLD,&stat);
        compress(B,tmpB);
        printout(B);
        printf("\n");
    }

    MPI_Finalize();
}

A last remark: in real life, if you were to do this sort of transfer, I would encourage you to compress the data into a contiguous quadrant before sending it, to avoid potentially useless extra copies inside the MPI library itself (although whether they happen or not is outside the scope of the MPI standard).

Martin Zabel · Answer 2 · 2015-12-13T11:01:04.987

The problem is that MPI_Recv is used with the same strided vector datatype, which does not match the layout of the receive buffer.

For example, the call

MPI_Send(QUADRANT(A,0,1),1,QUAD,1,0, MPI_COMM_WORLD);

together with the definition of QUAD correctly selects the data values of the upper right quadrant of A and sends the values 3.0, 4.0, 7.0, and 8.0 over the network.

But the same datatype cannot be used for the receive buffer, because the size of a row in B, and thus the stride, is smaller than in A. Therefore the values 7.0 and 8.0 are stored beyond the bounds of B:

Matrix as seen by MPI_Recv     Memory Layout of
with data-type QUAD            Matrix B
M[0][0]      <-- 3.0 -->       B[0][0]
M[0][1]      <-- 4.0 -->       B[0][1]
M[0][2]                        B[1][0]    <-- unchanged, e.g. 0.0
M[0][3]                        B[1][1]    <-- unchanged, e.g. 0.0
M[1][0]      <-- 7.0 -->       !beyond array!
M[1][1]      <-- 8.0 -->       !beyond array!
M[1][2]
M[1][3]
...

EDIT: To receive with the same type QUAD in a standard-conforming way, the receive buffer must have room for the strided layout. Thus, the receive buffer must be declared like this:

double B[SIZE/2][SIZE]; // SIZE elements per row.

Afterwards, one can compact the array as done by Gilles in his answer.
