
I have a very simple MPI program to test the behavior of MPI_Reduce. My objectives are simple:

Start by having each process generate a random number (range 1-100), then run the program with mpirun -np 5 <program_name_here>:

  • Have process 0 find the sum of all 5 numbers
  • Have process 1 find the product of all 5 numbers
  • Have process 2 find the max of all 5 numbers
  • Have process 3 find the min of all 5 numbers
  • Have process 4 find the bitwise AND of all 5 numbers

And here's my program:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <time.h>

int sum = 0;
int product = 0;
int max = 0;
int min = 0;
int bitwiseAnd = 0;

int main ( int argc, char **argv )
{
   int my_id, num_procs;
   MPI_Init(&argc, &argv);

   MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
   MPI_Comm_size(MPI_COMM_WORLD, &num_procs);

   int num;
   srand(time(NULL) * my_id);
   num = rand() % 100; // give num a random value (note: this yields 0-99)

   printf("Process #%i: Here is num: %i\n",my_id,num);


   if(my_id == 0){
      printf("Okay it entered 0\n");
      MPI_Reduce(&num, &sum,1,MPI_INT,MPI_SUM, 0, MPI_COMM_WORLD);
   }else if(my_id == 1){
      printf("Okay it entered 1\n");
      MPI_Reduce(&num, &product,1,MPI_INT,MPI_PROD, 0, MPI_COMM_WORLD);
   }else if(my_id == 2){
      printf("Okay it entered 2\n");
      MPI_Reduce(&num, &max,1,MPI_INT,MPI_MAX, 0, MPI_COMM_WORLD);
   }else if(my_id == 3){
      printf("Okay it entered 3\n");
      MPI_Reduce(&num, &min,1,MPI_INT,MPI_MIN, 0, MPI_COMM_WORLD);
   }else if(my_id == 4){
      printf("Okay it entered 4\n");
      MPI_Reduce(&num, &bitwiseAnd,1,MPI_INT,MPI_BAND, 0, MPI_COMM_WORLD);
   }

   MPI_Barrier(MPI_COMM_WORLD);

   if(my_id == 0){
      printf("I am process %i and the sum is %i\n",my_id,sum);
      printf("I am process %i and the product is %i\n",my_id,product);
      printf("I am process %i and the max is %i\n",my_id,max);
      printf("I am process %i and the min is %i\n",my_id,min);
      printf("I am process %i and the bitwiseAdd is %i\n",my_id,bitwiseAnd);
   }

   MPI_Finalize();
}

This produces output like this:

[blah@blah example]$ mpirun -np 5 all
Process #2: Here is num: 21
Okay it entered 2
Process #4: Here is num: 52
Okay it entered 4
Process #0: Here is num: 83
Okay it entered 0
Process #1: Here is num: 60
Okay it entered 1
Process #3: Here is num: 66
Okay it entered 3
I am process 0 and the sum is 282
I am process 0 and the product is 0
I am process 0 and the max is 0
I am process 0 and the min is 0
I am process 0 and the bitwiseAdd is 0
[blah@blah example]$

Why doesn't process 0 pick up the MPI_Reduce results from the other processes?

  • This is a stab in the dark, but is it possible that it *is* waiting, but each process has its own copy of the output variables and so process 0 can't see the results from all the other ones? – zwol Apr 04 '16 at 00:24
  • But if it were waiting, then why would those print statements appear after the results are printed? – Chisx Apr 04 '16 at 00:29
  • I believe that's stdio buffering confusing the issue. Those are the last things that processes 1-4 print, and you didn't put a \n at the end of those strings (which would have triggered a flush, since output is to a terminal), so they do not get printed until processes 1-4 *exit* (which does an implicit flush). – zwol Apr 04 '16 at 00:31
  • Ahh. That makes sense. I didn't think about that. From the documentation (what little there is: http://mpitutorial.com/tutorials/mpi-reduce-and-allreduce/), it seems as though MPI_Reduce should simply store the result in the global variable passed as a param. I'm assuming my variables aren't global and you are correct about the separate copies? Must globals go above main..? – Chisx Apr 04 '16 at 00:33
  • You are correct about the buffer flush. Adding "\n" on those printf's makes them print before the results are printed, and if I print `product` from `my_id == 1`, it's still equal to 0. So it's never setting the other variables with the Reduce call. – Chisx Apr 04 '16 at 00:39
  • Global variables must be declared *outside of all functions*, which is not quite the same as "above main". However, when I move your result variables outside of `main`, that doesn't fix the problem, and I'm afraid I don't know what else to change. – zwol Apr 04 '16 at 00:52
  • Yes, I tried moving it outside of main; that definitely didn't fix the issue either. It just doesn't seem to make sense to me that it can calculate the sum correctly and get it back to the root process, but all the other calculations fail. – Chisx Apr 04 '16 at 00:54
  • The sum is being computed *in* the root process... – zwol Apr 04 '16 at 00:59
  • Oh, that's right. So I guess I'm either doing something wrong or the other processes aren't doing anything right. – Chisx Apr 04 '16 at 01:06
  • Thanks, I'm a bit clueless about the problem myself. And most definitely very novice at MPI. – Chisx Apr 04 '16 at 01:15

2 Answers


I figured out what's wrong with your program by experimentation, and based on that, I have a hypothesis as to why it's wrong.

This modified version of your program does what you expected it to do:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <mpi.h>

int main (int argc, char **argv)
{
   int my_id;
   int num_procs;
   int num;
   int sum = 0;
   int product = 0;
   int max = 0;
   int min = 0;
   int bitwiseAnd = 0;
   int seed = time(0);

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
   MPI_Comm_size(MPI_COMM_WORLD, &num_procs);

   srand(seed * my_id);
   num = rand() % 100;

   printf("Process #%i: Here is num: %i\n",my_id,num);

   MPI_Reduce(&num, &sum,        1, MPI_INT, MPI_SUM,  0, MPI_COMM_WORLD);
   MPI_Reduce(&num, &product,    1, MPI_INT, MPI_PROD, 0, MPI_COMM_WORLD);
   MPI_Reduce(&num, &max,        1, MPI_INT, MPI_MAX,  0, MPI_COMM_WORLD);
   MPI_Reduce(&num, &min,        1, MPI_INT, MPI_MIN,  0, MPI_COMM_WORLD);
   MPI_Reduce(&num, &bitwiseAnd, 1, MPI_INT, MPI_BAND, 0, MPI_COMM_WORLD);

   MPI_Barrier(MPI_COMM_WORLD);

   if (my_id == 0) {
      printf("The sum is %i\n", sum);
      printf("The product is %i\n", product);
      printf("The max is %i\n", max);
      printf("The min is %i\n", min);
      printf("The bitwiseAnd is %i\n", bitwiseAnd);
   }

   MPI_Finalize();
   return 0;
}

Many of the changes I made are just cosmetic. The change that makes the difference is that all processes must execute all of the MPI_Reduce calls in order for all of the results to be computed.

Now, why does that matter? I must emphasize that this is a hypothesis. I do not know. But an explanation that fits the available facts is: in both your MPI implementation and mine, the actual computation in an MPI_Reduce call happens only on the root process, but all the other processes must also call MPI_Reduce in order to send a message with their values. That message doesn't depend on the operation argument. So the MPI_SUM call did what it was supposed to do by accident, because the other calls to MPI_Reduce provided the values it needed. But none of the other calls did any computation at all.

If my hypothesis is correct, you're going to need to structure your program quite a bit differently if you want to have each computation carried out in a different process. Abstractly, you want an all-to-all broadcast so that all processes have all the numbers, then local computation of sum, product, etc., then all-to-one send the values back to the root. If I'm reading http://mpitutorial.com/tutorials/mpi-scatter-gather-and-allgather/#mpi_allgather-and-modification-of-average-program correctly, MPI_Allgather is the name of the function that does all-to-all broadcasts.
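
Here is a rough sketch of what that restructuring could look like (untested, and it assumes exactly 5 ranks; nums, result, and results are names I made up):

/* Every rank receives all five numbers, computes "its" aggregate locally,
   and rank 0 then collects the five per-rank results. */
int nums[5], results[5];
MPI_Allgather(&num, 1, MPI_INT, nums, 1, MPI_INT, MPI_COMM_WORLD);

int result = nums[0];
for (int i = 1; i < 5; i++) {
   switch (my_id) {
      case 0: result += nums[i]; break;                      /* sum */
      case 1: result *= nums[i]; break;                      /* product */
      case 2: if (nums[i] > result) result = nums[i]; break; /* max */
      case 3: if (nums[i] < result) result = nums[i]; break; /* min */
      case 4: result &= nums[i]; break;                      /* bitwise AND */
   }
}

/* All-to-one: results on rank 0 ends up holding
   {sum, product, max, min, bitwiseAnd} in rank order. */
MPI_Gather(&result, 1, MPI_INT, results, 1, MPI_INT, 0, MPI_COMM_WORLD);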

zwol
  • The MPI_Reduce method should return the result to the root process, as far as I understand from http://mpitutorial.com/tutorials/mpi-reduce-and-allreduce/. As to the actual implementation, it did get confusing. Thank you for your solution! This does indeed work. – Chisx Apr 04 '16 at 01:39
  • Part of your hypothesis *"you have to call `MPI_Reduce` on all ranks"* is correct, but it is not correct that *"the actual computation [...] happens only on the root process"*. Please see my answer for details and standard citation. – Zulan Apr 04 '16 at 10:42

The answer from zwol is basically correct, but I would like to refine his hypothesis:

MPI_Reduce is a collective operation: it has to be called by all members of the communicator argument. In the case of MPI_COMM_WORLD, that means all initial ranks in the application.

The MPI standard (5.9.1) is also helpful here:

The routine is called by all group members using the same arguments for count, datatype, op, root and comm. Thus, all processes provide input buffers of the same length [...]

It is important to understand that the root is not the one doing all the computations. The operation is done in a distributed fashion, usually using a tree algorithm. This means only a logarithmic number of time steps has to be performed, which is much more efficient than collecting all data at the root and performing the operation there, especially for a large number of ranks.
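
To illustrate the idea (this is only a sketch, not how any particular MPI library actually implements it), a hand-rolled binomial-tree sum toward rank 0 could look like this, assuming num_procs is a power of two:

/* Each round, half of the remaining ranks send their partial sums to a
   partner; after log2(num_procs) rounds, rank 0 holds the global sum. */
int partial = num;
for (int step = 1; step < num_procs; step *= 2) {
   if (my_id % (2 * step) == 0) {
      int other;
      MPI_Recv(&other, 1, MPI_INT, my_id + step, 0, MPI_COMM_WORLD,
               MPI_STATUS_IGNORE);
      partial += other;   /* fold the partner's partial sum into ours */
   } else if (my_id % (2 * step) == step) {
      MPI_Send(&partial, 1, MPI_INT, my_id - step, 0, MPI_COMM_WORLD);
      break;              /* this rank has handed off its result */
   }
}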

So if you want the result at rank 0, you indeed have to run the code unconditionally like this:

MPI_Reduce(&num, &sum,        1, MPI_INT, MPI_SUM,  0, MPI_COMM_WORLD);
MPI_Reduce(&num, &product,    1, MPI_INT, MPI_PROD, 0, MPI_COMM_WORLD);
MPI_Reduce(&num, &max,        1, MPI_INT, MPI_MAX,  0, MPI_COMM_WORLD);
MPI_Reduce(&num, &min,        1, MPI_INT, MPI_MIN,  0, MPI_COMM_WORLD);
MPI_Reduce(&num, &bitwiseAnd, 1, MPI_INT, MPI_BAND, 0, MPI_COMM_WORLD);

If you need the result at different ranks, you can change the root parameter accordingly. If you want the result to be available at all ranks, use MPI_Allreduce instead.
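
For example, to make the sum visible on every rank:

/* Same arguments as MPI_Reduce, minus the root: every rank gets the result. */
MPI_Allreduce(&num, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);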

Zulan
  • If I understand what you're saying, code like what the OP originally had would start miscomputing the MPI_SUM, as well as not computing the rest of the aggregates at all, once the fan-in was large enough that the MPI library decided to start doing partial computations on non-root nodes (and therefore started paying attention to the opcode argument to MPI_Reduce on non-root nodes). Is it so? – zwol Apr 04 '16 at 15:34
  • @zwol, In practice, that is a reasonable explanation for the observed behavior. However, as soon as you violate MPI assumptions everything becomes undefined behavior. It may crash, hang forever, compute the wrong thing, compute the right thing. That can depend on the implementation and any arbitrary environmental factors. – Zulan Apr 04 '16 at 17:00