
This is what I am trying to achieve.

Blue is the message.
Yellow is when the specific node changes the leader known to it.
Green is the final election of each node.

[Figure: ring election diagram]

The code looks correct to me, but it always gets stuck inside the while loop no matter what I try. With a small number of nodes, it crashes with a segmentation fault after running for a while.

election_status=0;
firstmsg[0]=world_rank;     // self rank
firstmsg[1]=0;              // counter for hops
chief=world_rank;           // each node declares himself as leader
counter=0;                  // message counter for each node

// each node sends the first message to the next one
MPI_Send(&firstmsg, 2, MPI_INT, (world_rank+1)%world_size, 1, MPI_COMM_WORLD);
printf("Sent ID with counter to the right node [%d -> %d]\n",world_rank, (world_rank+1)%world_size);

while (election_status==0){
    // EDIT: Split MPI_Recv for rank 0 and rest
    if (world_rank==0){
        MPI_Recv(&incoming, 2, MPI_INT, world_size-1, 1, MPI_COMM_WORLD, &status);
    }
    else {
        MPI_Recv(&incoming, 2, MPI_INT, (world_rank-1)%world_size, 1, MPI_COMM_WORLD, &status);
    }
    counter=counter+1;
    if (incoming[0]<chief){
        chief=incoming[0];
    }
    incoming[1]=incoming[1]+1;

    // if message is my own and hopped same times as counter
    if (incoming[0]==world_rank && incoming[1]==counter) {
        printf("Node %d declares node %d a leader.\n", world_rank, chief);  
        election_status=1;
    }
    //sends the incremented message to the next node
    MPI_Send(&incoming, 2, MPI_INT, (world_rank+1)%world_size, 1, MPI_COMM_WORLD);  
}

MPI_Finalize();
quelotic
  • If you can't go out of the while loop, it means that you never go into the second if( ). So you need to understand why. – Gam Jun 07 '17 at 12:01
  • it seems curious to me, the counter and incoming[1] are being incremented without a stop. Why is that when theoretically there should be a continuation on the commands executed by each node? Shouldn't the counter=counter+1 command be executed for each node, after the node received a message? – quelotic Jun 07 '17 at 12:22
  • Hey, your `(world_rank-1)` for `MPI_Recv` looks not very good. Think about root with `rank=0`. – stas.yaranov Jun 07 '17 at 14:14
  • Thanks! That was the problem, I split the receive into 2 ifs and it's working! – quelotic Jun 07 '17 at 14:38

1 Answer


To determine the minimum of a value across all ranks and make the result available on every rank, use MPI_Allreduce!

  • MPI_Send is blocking. It can block forever until a matching receive is posted. Your program deadlocks on the first call to MPI_Send, or on any subsequent one should the first complete by coincidence. To avoid that specifically, use MPI_Sendrecv.
  • (world_rank-1)%world_size will produce -1 for world_rank == 0. Using -1 as a rank number is not valid. It might coincidentally be MPI_ANY_SOURCE.
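The second point comes from how C defines `%`: since C99 the quotient truncates toward zero, so `(0 - 1) % 4` evaluates to `-1`, not `3`. A minimal sketch of one common workaround is to add the ring size before taking the remainder. The helper names `ring_next` and `ring_prev` are hypothetical and do not appear in the question's code.

```c
/* Hypothetical helpers (not from the question): ring neighbours
 * computed so that the result always lies in [0, size). */

/* Right-hand neighbour: rank size-1 wraps around to 0. */
int ring_next(int rank, int size)
{
    return (rank + 1) % size;
}

/* Left-hand neighbour: adding size first keeps the dividend
 * non-negative, so rank 0 wraps around to size-1 instead of -1. */
int ring_prev(int rank, int size)
{
    return (rank - 1 + size) % size;
}
```

With these, the receive in the loop could use `ring_prev(world_rank, world_size)` as its source for every rank, without a special case for rank 0.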
Zulan
  • I split the receive into 2 ifs. It's working beautifully... Thanks! – quelotic Jun 07 '17 at 14:22
  • Your program is still incorrect! Do not rely on the nonblocking behavior that `MPI_Send` sometimes exhibits. Use `MPI_Sendrecv` for a portable conforming program. – Zulan Jun 07 '17 at 14:45