I want to send large messages in MPI, with more than 2^31 bytes (the same applies to char, double, or any other type, but I'll use bytes here). The point is to get around the limit imposed by the int count argument.
I have this code, which sends just a bit more than 2^31 B by sending 2050 MB. I use MPI_Probe and MPI_Get_count to determine the message size dynamically on the receiver side, and I run the code on 2 ranks.
#include <mpi.h>
#include <stdio.h>
#include <vector>
#include <cassert>

using namespace std;

int main()
{
    MPI_Init(NULL, NULL);

    int size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    // Derived datatype: one element is 1 MB (2^20 contiguous bytes).
    MPI_Datatype MPI_MEGABYTE_TYPE;
    int mega = 1048576;
    MPI_Type_contiguous(mega, MPI_BYTE, &MPI_MEGABYTE_TYPE);
    MPI_Type_commit(&MPI_MEGABYTE_TYPE);

    // 2050 MB in total, i.e. slightly more than 2^31 bytes.
    int count = 2050;
    size_t length = static_cast<size_t>(mega) * static_cast<size_t>(count);
    vector<char> buffer(length);

    if (my_rank == 0) {
        MPI_Send(buffer.data(), count, MPI_MEGABYTE_TYPE, 1, 0, MPI_COMM_WORLD);
    } else {
        {
            // Query the size of the incoming message before receiving it.
            MPI_Status mpi_status;
            int mpi_count = 0;
            MPI_Probe(0, 0, MPI_COMM_WORLD, &mpi_status);
            MPI_Get_count(&mpi_status, MPI_MEGABYTE_TYPE, &mpi_count);
            printf("get count before = %d\n", mpi_count);
        }
        {
            // Receive the message, then query the size again from the receive status.
            MPI_Status mpi_status;
            int mpi_count = 0;
            MPI_Recv(buffer.data(), count, MPI_MEGABYTE_TYPE, 0, 0, MPI_COMM_WORLD, &mpi_status);
            MPI_Get_count(&mpi_status, MPI_MEGABYTE_TYPE, &mpi_count);
            printf("get count after = %d\n", mpi_count);
        }
    }

    MPI_Finalize();
}
Depending on the MPI implementation, I usually get the following (for example with MPICH 3.3.2 or OpenMPI 4.0.3 on Mac, or with other versions of OpenMPI on a Linux cluster):
get count before = 2050
get count after = 2050
but sometimes (in particular with MVAPICH2-2.3 on a Linux cluster) I get:
get count before = 2048
get count after = 2050
Am I correct in saying that 2048 is wrong (and hence that this implementation has a bug) and that it should be 2050? 2050 seems obviously the right answer, but MPI can be tricky and I could have missed something. I haven't been able to find a clear answer in the standard (https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report.pdf) on how MPI_Get_count behaves with composite (derived) datatypes.