
I'm an MPI newbie trying to parallelize my code with MPI (I need to run some experiments faster). It should work like this: the master sends an array of strings to the slaves, they do some job and send a status_ready back to the master. When all slaves are ready, the master goes into a loop and iteratively sends a vector of doubles to the slaves; the slaves process this vector and send their results (two vectors) back to the master. When all the messages are received, the master processes them and the loop iterates (the master sends the results back to the slaves, and so on).
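Here is a minimal, self-contained sketch of the pattern I am aiming for (the tags, the buffer size n, and the fixed three-iteration stop condition are placeholders, not my real data):

#include <mpi.h>
#include <string>
#include <vector>

enum { TAG_SETUP = 1, TAG_READY = 2, TAG_WORK = 3, TAG_RES_A = 4, TAG_RES_B = 5 };

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    const int master = 0, n = 4, iterations = 3;   // placeholders

    if (rank == master) {
        std::string setup = "config";              // stand-in for the string array
        for (int i = 1; i < size; i++)
            MPI_Send(setup.c_str(), (int)setup.size() + 1, MPI_CHAR, i, TAG_SETUP, MPI_COMM_WORLD);
        int ready;
        for (int i = 1; i < size; i++)             // wait until every slave reports ready
            MPI_Recv(&ready, 1, MPI_INT, i, TAG_READY, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        std::vector<double> work(n, 1.0), resA(n), resB(n);
        for (int it = 0; it < iterations; it++) {
            for (int i = 1; i < size; i++)         // send the current vector to every slave
                MPI_Send(work.data(), n, MPI_DOUBLE, i, TAG_WORK, MPI_COMM_WORLD);
            for (int i = 1; i < size; i++) {       // collect the two result vectors
                MPI_Recv(resA.data(), n, MPI_DOUBLE, i, TAG_RES_A, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Recv(resB.data(), n, MPI_DOUBLE, i, TAG_RES_B, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            }
            for (int k = 0; k < n; k++)            // master's processing step (placeholder)
                work[k] = resA[k] + resB[k];
        }
    } else {
        char setup[64];
        MPI_Recv(setup, 64, MPI_CHAR, master, TAG_SETUP, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        int ready = 1;
        MPI_Send(&ready, 1, MPI_INT, master, TAG_READY, MPI_COMM_WORLD);

        std::vector<double> work(n), resA(n), resB(n);
        for (int it = 0; it < iterations; it++) {
            MPI_Recv(work.data(), n, MPI_DOUBLE, master, TAG_WORK, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            for (int k = 0; k < n; k++) {          // the slaves' "some job" (placeholder)
                resA[k] = work[k] * 2.0;
                resB[k] = work[k] + rank;
            }
            MPI_Send(resA.data(), n, MPI_DOUBLE, master, TAG_RES_A, MPI_COMM_WORLD);
            MPI_Send(resB.data(), n, MPI_DOUBLE, master, TAG_RES_B, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}

And here is my actual attempt: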

#include <iostream>
#include <mpi.h>
#include <cmath>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

using namespace std;
using namespace cv;
//int mod(int z, int l);

int xGradient(Mat image, int x, int y)
{
  return image.at<uchar>(y-1, x-1) +
    2*image.at<uchar>(y, x-1) +
    image.at<uchar>(y+1, x-1) -
    image.at<uchar>(y-1, x+1) -
    2*image.at<uchar>(y, x+1) -
    image.at<uchar>(y+1, x+1);
}

int yGradient(Mat image, int x, int y)
{
  return image.at<uchar>(y-1, x-1) +
    2*image.at<uchar>(y-1, x) +
    image.at<uchar>(y-1, x+1) -
    image.at<uchar>(y+1, x-1) -
    2*image.at<uchar>(y+1, x) -
    image.at<uchar>(y+1, x+1);
}

int main(int argc, char **argv)
{
  Mat src, grey, dst;
  double start, end;
  int i, gx, gy, sum, awal, akhir, size, rank, slave;
  int master=0;
  // MPI_Status status;
  awal = MPI_Init(&argc, &argv);   // argc/argv must be the real ones from main, not uninitialized locals
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  slave=size-1;
  start=MPI_Wtime();
  if( rank == master )
    {
      // start=MPI_Wtime();
      src= imread("E:/tigaout/Debug/jari.jpg");  
      cvtColor(src,grey,CV_BGR2GRAY);

      //MPI_Send(&(row_pointers[i*share+done][0]), 1, newtype, i, 1, MPI_COMM_WORLD);
      dst = grey.clone();
      if( !grey.data )
        {
          return -1;
        }
      for (i=1; i<slave; i++)
        {
          MPI_Send(&dst, 1, MPI_DOUBLE, i, 1, MPI_COMM_WORLD);
          cout<<"master mengirim data ke rank 1"<<dst<<endl;
        }
    }
  MPI_Barrier(MPI_COMM_WORLD);
  if (rank != master)
    {
      MPI_Recv(&dst, 1, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      cout<<"rank 1 menerima data"<<dst<<endl;
    }
  for(int y = 0; y < grey.rows; y++)
    for(int x = 0; x < grey.cols; x++)
      dst.at<uchar>(y,x) = 0;  

  for(int y = 1; y < grey.rows - 1; y++)
    {
      for(int x = 1; x < grey.cols - 1; x++)
        {
          gx = xGradient(grey, x, y);
          gy = yGradient(grey, x, y);
          sum = abs(gx) + abs(gy);
          sum = sum > 255 ? 255:sum;
          sum = sum < 0 ? 0 : sum;
          dst.at<uchar>(y,x) = sum;
        }
    }

  /* namedWindow("sobel edge detection");
     imshow("sobel edge detection", dst);

     namedWindow("grayscale");
     imshow("grayscale", grey);

     namedWindow("Original");
     imshow("Original", src); */

  imwrite( "E:/tigaout/Debug/deteksi jari.jpg", dst );
  MPI_Barrier(MPI_COMM_WORLD);
  end=MPI_Wtime();
  cout<<"waktu eksekusi adalah: "<< end-start << " detik " <<endl;
  akhir=MPI_Finalize();

  //waitKey();

  return 0;
}

I already tried to write this with MPI point-to-point send/receive, but my code is always wrong. Where is my mistake, and how do I fix it?

  • "my code always wrong" doesn't tell us anything. What behavior are you expecting and what behavior does your code exhibit? – Captain Obvlious Dec 24 '13 at 06:11
  • `if(rank==master){MPI_Send();}MPI_Barrier();if(rank!=master){MPI_Recv()}` is a deadlock: process `master` will wait until its messages are received, and the `MPI_Barrier()` will never complete. Try removing the `MPI_Barrier()`! Also, copying `dst` from `master` to all other processes is called a broadcast: see MPI_Bcast() http://www.mcs.anl.gov/research/projects/mpi/www/www3/MPI_Bcast.html – francis Dec 24 '13 at 09:23
  • Your first call to `MPI_Barrier` is a deadlock. Remove it, run your code, and tell us what goes wrong. – Phil Miller Jan 03 '14 at 18:58
  • Your computation of the Sobel operator is not optimized. Would you be satisfied with a solution that is 10x faster and still running in one thread? – morotspaj Jan 03 '14 at 20:26

1 Answer


You'll probably get better performance, and certainly simpler code, by using collectives. The step in which your slaves send data to the master is equivalent to MPI_Gather, and the step in which the master sends new data out to each of the slaves is an MPI_Scatter.
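For example, splitting the image rows across ranks and reassembling the filtered result could look roughly like this. This is a sketch, not drop-in code: the fixed 512x512 size stands in for your grey.rows/grey.cols, it assumes the row count divides evenly by the number of ranks, and it ignores the one-row halo the Sobel stencil needs at chunk boundaries.

#include <mpi.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int rows = 512, cols = 512;          // stand-ins for grey.rows / grey.cols
    const int chunk = rows / size;             // assumes rows % size == 0

    std::vector<unsigned char> image;          // full image, only meaningful on rank 0
    if (rank == 0)
        image.assign(rows * cols, 128);        // in your code: the pixels of grey

    std::vector<unsigned char> myRows(chunk * cols);
    // every rank (root included) receives `chunk` contiguous rows
    MPI_Scatter(image.data(), chunk * cols, MPI_UNSIGNED_CHAR,
                myRows.data(), chunk * cols, MPI_UNSIGNED_CHAR,
                0, MPI_COMM_WORLD);

    // ... run the Sobel loops on myRows here ...

    std::vector<unsigned char> result;
    if (rank == 0)
        result.resize(rows * cols);
    // rank 0 reassembles the filtered image from all ranks
    MPI_Gather(myRows.data(), chunk * cols, MPI_UNSIGNED_CHAR,
               result.data(), chunk * cols, MPI_UNSIGNED_CHAR,
               0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}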

I think the conceptual piece that may be causing problems in your attempts so far is that MPI programs follow a single-program, multiple-data model: every rank executes the same code and differs only in the value of "rank". Your blocks if (rank == master) and if (rank != master) suggest you understand that, but when using a barrier or any other collective operation you have to keep in mind that no rank in the communicator you pass to it will get past that point in the code until all the rest arrive. The calls you're making to MPI_Send are blocking, so the master rank probably won't get past the first send until the receiving rank posts a matching MPI_Recv, which never happens because the receiving rank is stuck in the barrier.
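To make that concrete, here is a sketch of your first data-distribution step rewritten with MPI_Bcast, as francis suggested in the comments. Because every rank calls the broadcast at the same point in the code, no separate sends, receives, or barrier are needed. The dimensions and pixel values below are stand-ins for your grey Mat; broadcasting the dimensions first lets the non-root ranks size their buffers.

#include <mpi.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int dims[2] = {0, 0};                      // rows, cols
    std::vector<unsigned char> pixels;
    if (rank == 0) {
        dims[0] = 512; dims[1] = 512;          // in your code: grey.rows, grey.cols
        pixels.assign(dims[0] * dims[1], 128); // in your code: the pixels of grey
    }

    // every rank, root included, calls MPI_Bcast at the same point;
    // this replaces the MPI_Send / MPI_Barrier / MPI_Recv sequence entirely
    MPI_Bcast(dims, 2, MPI_INT, 0, MPI_COMM_WORLD);
    if (rank != 0)
        pixels.resize(dims[0] * dims[1]);      // non-root ranks size their buffer first
    MPI_Bcast(pixels.data(), dims[0] * dims[1], MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}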

Hope that helps.

  • http://stackoverflow.com/questions/13867809/how-are-mpi-scatter-and-mpi-gather-used-from-c – Aaron Altman Jan 07 '14 at 16:06
  • Please upvote and mark accepted if that helps, or continue to ask for clarification if there are any specific points out of these links or related material that aren't making sense. – Aaron Altman Jan 07 '14 at 16:40