Please correct me if I am misunderstanding how MPI_Send and MPI_Recv work, since I have just started learning MPI.
My current understanding is that the MPI standard guarantees that two messages sent one after another from one sender to one receiver will always be matched by the receiver in the order they were sent (the non-overtaking rule). This suggests to me that some kind of queuing must be happening, either at the receiver, at the sender, or as part of some distributed state.
I am trying to understand the nature of this queue, so I wrote a simple ping-pong program in which each odd-ranked process exchanges messages with the even-ranked process directly below it (rank - 1).
The idea is that if there is a global queue shared across all the nodes in the cluster, then running with a higher number of nodes should substantially increase the latency observed at each node. On the other hand, if the queue is at each receiver, then the latency increase should be relatively small. However, I get very mixed results, so I am not sure how to interpret them.
Can someone provide an interpretation of the following results, with respect to where the queue is resident?
$ mpirun -np 2 simple
Rank = 0, Message Length = 0, end - start = 0.000119
$ mpirun -np 2 simple
Rank = 0, Message Length = 0, end - start = 0.000117
$ mpirun -np 4 simple
Rank = 2, Message Length = 0, end - start = 0.000119
Rank = 0, Message Length = 0, end - start = 0.000253
$ mpirun -np 4 simple
Rank = 2, Message Length = 0, end - start = 0.000129
Rank = 0, Message Length = 0, end - start = 0.000303
$ mpirun -np 6 simple
Rank = 4, Message Length = 0, end - start = 0.000144
Rank = 2, Message Length = 0, end - start = 0.000122
Rank = 0, Message Length = 0, end - start = 0.000415
$ mpirun -np 8 simple
Rank = 4, Message Length = 0, end - start = 0.000119
Rank = 0, Message Length = 0, end - start = 0.000336
Rank = 2, Message Length = 0, end - start = 0.000323
Rank = 6, Message Length = 0, end - start = 0.000287
$ mpirun -np 10 simple
Rank = 2, Message Length = 0, end - start = 0.000127
Rank = 8, Message Length = 0, end - start = 0.000158
Rank = 0, Message Length = 0, end - start = 0.000281
Rank = 4, Message Length = 0, end - start = 0.000286
Rank = 6, Message Length = 0, end - start = 0.000278
This is the code that implements the ping-pong.
#include "mpi.h"
#include <cstdio>   // printf
#include <iostream> // std::cerr
#define MESSAGE_COUNT 100
int main(int argc, char* argv[]){
if (MPI_Init( &argc, &argv) != MPI_SUCCESS) {
std::cerr << "MPI Failed to Initialize" << std::endl;
return 1;
}
int rank = 0, size = 0;
// Get this process's rank and the total number of ranks
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
// With an odd number of ranks the highest even rank has no partner,
// so it must drop out rather than send to a nonexistent rank.
if ((size % 2) && rank == size - 1) {
    MPI_Finalize();
    return 0;
}
size_t message_len = 0;
char* buf = new char[message_len]; // zero-length payload; the allocation is still valid
MPI_Status status;
// Pingpong between even and odd machines
if (rank & 1) { // Odd ranked machine will just pong
for (int i = 0; i < MESSAGE_COUNT; i++) {
MPI_Recv(buf, (int) message_len, MPI_CHAR, rank - 1, 0, MPI_COMM_WORLD, &status); // receive only from our even partner
MPI_Send(buf, (int) message_len, MPI_CHAR, rank - 1, 0, MPI_COMM_WORLD);
}
}
else { // Even ranked machine will ping and time.
double start = MPI_Wtime();
for (int i = 0; i < MESSAGE_COUNT; i++) {
MPI_Send(buf, (int) message_len, MPI_CHAR, rank + 1, 0, MPI_COMM_WORLD);
MPI_Recv(buf, (int) message_len, MPI_CHAR, rank + 1, 0, MPI_COMM_WORLD, &status); // receive only from our odd partner
}
double end = MPI_Wtime();
printf("Rank = %d, Message Length = %zu, end - start = %f\n", rank, message_len, end - start);
}
delete[] buf;
MPI_Finalize();
return 0;
}