
I'm learning MPI and am trying to implement a pipelined ring broadcast that sends the data in chunks of varying size. However, when I run my code, it deadlocks while process 0 attempts to send the second chunk of data, and I have no idea why. Any help would be appreciated.

NOTE: this is part of a much larger program, which fills a buffer with chars on process 0. After some simple debugging with print statements, I believe something is wrong with line 9 (marked with ***), because that is where the program stalls: the second chunk of data is never sent from process 0.

int offset;
MPI_Status status;

if (rank == 0) {
    offset = 0;
    while (offset < NUM_BYTES) {
        MPI_Send(&chunk_size, 1, MPI_INT, rank + 1, 3, MPI_COMM_WORLD);
        MPI_Send(&offset, 1, MPI_INT, rank + 1, 2, MPI_COMM_WORLD);
        MPI_Send(&buffer[offset], chunk_size, MPI_BYTE, rank + 1, 1, MPI_COMM_WORLD); /* *** */
        offset = offset + chunk_size;
        if ((offset + chunk_size) >= NUM_BYTES) {
            chunk_size = (NUM_BYTES - offset);
        }
    }
}
else {
    MPI_Recv(&chunk_size, 1, MPI_INT, rank - 1, 3, MPI_COMM_WORLD, &status);
    MPI_Recv(&offset, 1, MPI_INT, rank - 1, 2, MPI_COMM_WORLD, &status);
    MPI_Recv(&buffer[offset], chunk_size, MPI_BYTE, rank - 1, 1, MPI_COMM_WORLD, &status);
    if (rank != num_procs - 1) {
        MPI_Send(&chunk_size, 1, MPI_INT, rank + 1, 3, MPI_COMM_WORLD);
        MPI_Send(&offset, 1, MPI_INT, rank + 1, 2, MPI_COMM_WORLD);
        MPI_Send(&buffer[offset], chunk_size, MPI_BYTE, rank + 1, 1, MPI_COMM_WORLD);
    }
}
iltp38
    Just a remark: as MPI preserves the chronological order of messages with the same tag sent between two processes within a given communicator, you don't actually need to send the offset. Simply keep a counter and increment it each time with the amount of data received. Also, sending the chunk size is redundant as you could simply use `MPI_Probe` to probe for a message and `MPI_Get_count` to get the number of elements in the [probed] message. – Hristo Iliev Oct 21 '15 at 06:47

1 Answer


The code looks fine (although not very efficient, since all communications are serialised), but it has one big omission: only process #0 communicates in a loop. It sends several chunks, whereas every other process expects only a single set of communications, so the second chunk sent by process #0 is never matched by a receive on process #1. Add the same while loop to the else part and it should work.

Gilles
  • Thank you! I was able to fix my problem by adding the while loop that you suggested as well as adjusting it to check if ((offset + chunk_size) < NUM_BYTES) rather than (offset < NUM_BYTES) since a non-master will never receive an offset greater than NUM_BYTES. – iltp38 Oct 21 '15 at 07:15