
I need the reducing node to get a copy of a list of elements (stored in a vector) from the other nodes. I defined my own reducing function, but it is not working: the program terminates/crashes.

This is the code:

#include <iostream>
#include "mpi.h"
#include <vector>

using namespace std;

void pushTheElem(vector<int>* in, vector<int>* inout, int *len, MPI_Datatype *datatype)
{
    vector<int>::iterator it;
    for (it = in->begin(); it < in->end(); it++)
    {
        inout->push_back(*it);
    }
}

int main(int argc, char **argv)
{
    int numOfProc, procID;
    vector<int> vect, finalVect;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numOfProc);
    MPI_Comm_rank(MPI_COMM_WORLD, &procID);

    MPI_Op myOp;
    MPI_Op_create((MPI_User_function*)pushTheElem, true, &myOp);

    for (int i = 0; i < 5; i++)
    {
        vect.push_back(procID);
    }

    MPI_Reduce(&vect, &finalVect, 5, MPI_INT, myOp, 0, MPI_COMM_WORLD);

    if (procID == 0)
    {
        vector<int>::iterator it;
        cout << "Final vector elements: " << endl;

        for (it = finalVect.begin(); it < finalVect.end(); it++)
            cout << *it << endl;
    }

    MPI_Finalize();
    return 0;
}
Jack
  • Please provide some more details about the crash. There should at least be some kind of exception happening... What is the output of the program? – K. Kirsz Jul 14 '17 at 12:02
  • This is what I get: job aborted: [ranks] message. [0] process exited without calling finalize. [1] terminated. ---- error analysis ----- testMPI.exe ended prematurely and may have crashed. exit code 0xc0000005. – Jack Jul 14 '17 at 12:05
  • I got that when I ran the program with 2 processes. – Jack Jul 14 '17 at 12:06
  • First, you need to initialize finalVect on rank 0 – Gilles Gouaillardet Jul 14 '17 at 12:25
  • @GillesGouaillardet Thank you for your comment. The code below, edited by K. Kirsz, worked, but there is one issue I wrote about in a comment below his code. Can you please take a look at it? – Jack Jul 14 '17 at 12:36

2 Answers


It seems you want to collect all elements from all processes. That is not a reduction, it is a gather operation. A reduction combines multiple arrays of the same length into one array of that same length, element by element (for example, MPI_SUM over {0,0,0,0,0} and {1,1,1,1,1} yields {1,1,1,1,1}):

[Diagram: MPI_Reduce — equal-length arrays combined element-wise into one array at the root]

That model does not fit here, where combining two arrays yields an array whose length is the sum of the input lengths. Also, with MPI you cannot simply operate with pointers the way your reduction operation tries to. You cannot send pointers around with MPI, as the processes have separate address spaces. The MPI interface does use pointers, but only to regions of data of a known type and a known size.

You can easily do your task with MPI_Gather.

[Diagram: MPI_Gather — each rank's block of elements collected and concatenated at the root]

// vect.size() must be the same on every process, otherwise use MPI_Gatherv
// finalVect is only needed on the root.
if (procID == 0) finalVect.resize(numOfProc * vect.size());
MPI_Gather(vect.data(), 5, MPI_INT, finalVect.data(), 5, MPI_INT, 0, MPI_COMM_WORLD);
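
If the per-rank sizes can differ, MPI_Gatherv is the tool. A minimal sketch of that case (untested, in the same spirit as the snippet above; counts and displs are illustrative names):

// Each rank may contribute a different number of elements.
int count = static_cast<int>(vect.size());
vector<int> counts(numOfProc), displs(numOfProc);

// The root first learns how many elements each rank will send.
MPI_Gather(&count, 1, MPI_INT, counts.data(), 1, MPI_INT, 0, MPI_COMM_WORLD);

if (procID == 0)
{
    int total = 0;
    for (int r = 0; r < numOfProc; ++r)
    {
        displs[r] = total;   // offset of rank r's block in finalVect
        total += counts[r];
    }
    finalVect.resize(total);
}

// Variable-count gather: rank r's data lands at finalVect[displs[r]] on the root.
MPI_Gatherv(vect.data(), count, MPI_INT,
            finalVect.data(), counts.data(), displs.data(), MPI_INT,
            0, MPI_COMM_WORLD);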
Zulan

I don't think you can pass vectors through MPI this way. MPI takes the first pointer and interprets it as a blob of data of the given type (INT) and length. Consider how vector is implemented: the vector object itself is just a small control structure that points to an array on the heap. So by passing a vector* you are not providing a pointer to the data but to this control structure, which leads to undefined behavior when the program tries to use it as the element data.
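
To illustrate (a rough sketch; the exact layout of the control structure is implementation-defined):

vector<int> v{ 1, 2, 3 };

// &v points at the control structure (typically three pointers:
// begin, end, end-of-capacity), not at the integers themselves,
// and sizeof(v) is the size of that structure, not 3 * sizeof(int).
int* raw = v.data();   // points at the contiguous elements 1, 2, 3
// raw (equivalently &v[0]) is what an MPI buffer argument must be.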

You need to operate on raw data with MPI. Try this (not tested, since I have no MPI at hand):

#include <iostream>
#include "mpi.h"
#include <vector>

using namespace std;

void pushTheElem(int* in, int* inout, int *len, MPI_Datatype *datatype)
{
    for(inti=0;i<*len;++i){
      inout[i]=in[i];
    }
}

int main(int argc, char **argv)
{
    int numOfProc, procID;
    vector<int> vect, finalVect;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numOfProc);
    MPI_Comm_rank(MPI_COMM_WORLD, &procID);

    MPI_Op myOp;
    MPI_Op_create((MPI_User_function*)pushTheElem, true, &myOp);

    for (int i = 0; i < 5; i++)
    {
        vect.push_back(procID);
    }
    finalVect.resize(vect.size());
    MPI_Reduce(vect.data(), finalVect.data(), 5, MPI_INT, myOp, 0, MPI_COMM_WORLD);

    if (procID == 0)
    {
        vector<int>::iterator it;
        cout << "Final vector elements: " << endl;

        for (it = finalVect.begin(); it < finalVect.end(); it++)
            cout << *it << endl;
    }

    MPI_Finalize();
    return 0;
}
K. Kirsz
  • Thank you very much, now I understand how it works. The code does not produce any error; the only thing is that finalVect always ends up with only the elements from the last process handled. The loop in pushTheElem overwrites whatever is already in finalVect. – Jack Jul 14 '17 at 12:22
  • @Jack What is the expected behavior? Can you specify what you expect in finalVect when run with 2 MPI tasks? – Gilles Gouaillardet Jul 14 '17 at 13:02
  • I basically want the output to contain the elements collected from both processes. @Zulan has clarified that what I should be using is MPI_Gather, which gave me the correct output. Thank you. – Jack Jul 14 '17 at 13:43