
I have written my code for a single Xeon Phi node (with 61 cores on it). I have two files. I call MPI_Init(2) before making any other MPI calls. I also obtain ntasks and rank using MPI calls, and I have included all the required libraries. Still, I get an error. Can you please help me out with this?

In file 1:

 int    buffsize;
 int    *sendbuff,**recvbuff,buffsum;
 int *shareRegion;
 shareRegion = (int*)gInit(MPI_COMM_WORLD, buffsize, ntasks);   /* gInit is in file 2 */
 buffsize=atoi(argv[1]);
 sendbuff=(int *)malloc(sizeof(int)*buffsize);
 if( taskid == 0 ){
   recvbuff=(int **)malloc(sizeof(int *)*ntasks);
   recvbuff[0]=(int *)malloc(sizeof(int)*ntasks*buffsize);
   for(i=1;i<ntasks;i++)recvbuff[i]=recvbuff[i-1]+buffsize;
 }
 else{
   recvbuff=(int **)malloc(sizeof(int *)*1);
   recvbuff[0]=(int *)malloc(sizeof(int)*1);
 }

 for(i=0;i<buffsize;i++){
     sendbuff[i]=1;
 }

 MPI_Barrier(MPI_COMM_WORLD);

 call(sendbuff, buffsize, shareRegion, recvbuff[0],buffsize,taskid,ntasks);

In file 2:

 void* gInit( MPI_Comm comm, int size, int num_proc)
 {
    int share_mem = shm_open("share_region", O_CREAT|O_RDWR,0666 );

    if( share_mem == -1)
     return NULL;
    int rank;
    MPI_Comm_rank(comm,&rank);

    if( ftruncate( share_mem, sizeof(int)*size*num_proc) == -1 )
       return NULL;

    int* shared =  mmap(NULL, sizeof(int)*size*num_proc, PROT_WRITE | PROT_READ,    MAP_SHARED, share_mem, 0);

    if(shared == (void*)-1)
       printf("error in mem allocation (mmap)\n");

    *(shared+(rank)) = 0;

    MPI_Barrier(MPI_COMM_WORLD);

    return shared;
 }

 void call(int *sendbuff, int sendcount, volatile int *sharedRegion, int **recvbuff, int recvcount, int rank, int size)
 {
    int i=0;
    int k,j;
    j=rank*sendcount;
    for(i=0;i<sendcount;i++)
    {
      sharedRegion[j] = sendbuff[i];
      j++;
    }

    if( rank == 0)
      for(k=0;k<size;k++)
        for(i=0;i<sendcount;i++)
        {

           j=0;
           recvbuff[k][i] = sharedRegion[j];
           j++;

        }
 }

Then I do some computation in file 1 on this recvbuff. I get a segmentation fault when using the sharedRegion variable.

Boppity Bop

3 Answers


MPI implements the message-passing paradigm. That means processes (ranks) are isolated and generally run on a distributed machine. They communicate via explicit messages; recent versions also allow one-sided, but still explicit, data transfer. You cannot assume that shared memory is available to the processes. Have a look at any MPI tutorial to see how MPI is used.

Since you did not specify on what kind of machine you are running, any further suggestion is purely speculative. If you actually are on a shared memory machine, you may want to use a real shared memory paradigm instead, e.g. OpenMP.

Zulan

While it's possible to restrict MPI to only use one machine and have shared memory (see the RMA chapter, especially in MPI-3), if you're only ever going to use one machine, it's easier to use some other paradigm.

However, if you're going to use multiple nodes and have multiple ranks on one node (multi-core processors, for example), then it might be worth taking a look at MPI-3 RMA to see how it can help you with both locally shared memory and remote memory access. There are multiple papers out on the subject, but because the features are so new, there aren't many good tutorials yet. You'll have to dig around a bit to find something useful to you.

Wesley Bland

The ordering of these two lines:

shareRegion = (int*)gInit(MPI_COMM_WORLD, buffsize, ntasks);   /* gInit is in file 2 */
buffsize=atoi(argv[1]);

suggests that buffsize could have different values before and after the call to gInit. If the value of buffsize passed as the first argument to the program is larger than the value it held when gInit was called, then an out-of-bounds memory access would occur later and lead to a segmentation fault.
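A small sketch of the size arithmetic may make this concrete. The helper names parse_buffsize, region_bytes and slice_fits are hypothetical; they assume the layout used in the question, where rank r writes sendcount ints starting at element r * sendcount.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse the buffer size from argv[1].
 * Returns -1 if the text is not a positive int. */
int parse_buffsize(const char *arg)
{
    char *end;
    long v = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || v <= 0 || v > INT_MAX)
        return -1;
    return (int)v;
}

/* Bytes needed for a region holding buffsize ints per rank. */
size_t region_bytes(int buffsize, int ntasks)
{
    return sizeof(int) * (size_t)buffsize * (size_t)ntasks;
}

/* Returns 1 if the slice written by `rank` (sendcount ints starting at
 * element rank * sendcount) lies inside a region of region_size bytes. */
int slice_fits(int rank, int sendcount, size_t region_size)
{
    size_t end = ((size_t)rank + 1) * (size_t)sendcount * sizeof(int);
    return end <= region_size;
}
```

If the region is sized with one value of buffsize and later written with a larger one, slice_fits returns 0 for the higher ranks, which is the out-of-bounds case described above. Parsing argv[1] before calling gInit keeps the two values identical.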

Hint: run your code as an MPI singleton (e.g. without mpirun) from inside a debugger (e.g. gdb) or change the limits so that cores would get dumped on error (e.g. with ulimit -c unlimited) and then examine the core file(s) with the debugger. Compiling with debug information (e.g. adding -g to the compiler options) helps a lot in such cases.

Hristo Iliev