I'm writing a C++ program that uses Open MPI. It executes in "rounds": in each round, process 0 sends chunks of data to the other processes, which do stuff to it and send results back. When there are no more chunks to send, process 0 sends a "done" message to every other process. A "done" message is just a single-int message with tag 3. My first round executes fine. However, in round two, processes 1 through p "probe" and "receive" a done message before process 0 has had a chance to send anything (let alone a done message).

I've gone over my code many times now, and the only place this message could be coming from seems to be where process 0 sent it in the previous round, but each process had already received that one. I'd rather not post my code since it's pretty big, but does anyone know if MPI messages can be received twice like this?

FrancesKR

1 Answer

I think I may have the answer... Since the actual data in the done message doesn't matter, I didn't think to have the processes actually receive it. It turns out that in the previous round, the processes were "probing" the message, finding that the tag was 3, and then breaking out of their loop without ever receiving it. So in round two the message was still waiting to be received, and when they called MPI_Probe, they found the same message as in the previous round.

To solve this I just added a call to MPI_Recv so that the done message is actually received. I looked at MPI_Cancel, but I can't find enough information about it to see if it would be appropriate. Sorry for being misleading in my question!
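
To make the probe-versus-receive distinction concrete, here is a toy model in plain C++. It is not MPI code and does not call the MPI library: `Mailbox`, `Message`, `run_round`, and `kDataTag` are hypothetical names made up for illustration, and only the tag value 3 comes from the question. In this sketch, `probe` peeks at the head of the queue roughly the way MPI_Probe inspects a pending message, and `recv` removes it roughly the way MPI_Recv does.

```cpp
#include <cstddef>
#include <deque>

// Toy stand-in for one process's queue of incoming messages.
// This is an illustration only, NOT the MPI API.
struct Message {
    int tag;      // 3 marks "done", as in the question
    int payload;  // ignored for "done" messages
};

const int kDoneTag = 3;
const int kDataTag = 1;  // hypothetical tag for data chunks

class Mailbox {
public:
    void send(Message m) { queue_.push_back(m); }

    // Like a probe: report the next message but leave it in the queue.
    bool probe(Message& out) const {
        if (queue_.empty()) return false;
        out = queue_.front();
        return true;
    }

    // Like a receive: report the next message and remove it from the queue.
    bool recv(Message& out) {
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

    std::size_t pending() const { return queue_.size(); }

private:
    std::deque<Message> queue_;
};

// Worker loop for one round; returns the number of data chunks handled.
// With consume_done == false it reproduces the bug: the "done" message
// is probed but never received, so it is still queued for the next round.
int run_round(Mailbox& box, bool consume_done) {
    int handled = 0;
    Message next{};
    while (box.probe(next)) {
        if (next.tag == kDoneTag) {
            if (consume_done) box.recv(next);  // the fix: actually receive it
            break;
        }
        box.recv(next);  // take the data chunk off the queue
        ++handled;
    }
    return handled;
}
```

With `consume_done` set to false, the done message from round one is still at the head of the queue, so a second call to `run_round` stops immediately without handling any chunks, which matches what I saw in round two. With it set to true, nothing is left over between rounds.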

FrancesKR
  • Also note: if the information in a message is held purely within its tag (such as a 'done' message), you can send a message with count = 0. – ricky116 Aug 18 '13 at 17:41
  • @ricky116 Would I still have to receive it in that case? – FrancesKR Aug 18 '13 at 17:47
  • If you send it with `MPI_Isend`, it still has to be handled by a receiving operation on the target process. So yes: whilst the information you want will be fully available through `MPI_Iprobe`, you still need to perform the `MPI_Recv` on the zero-sized message. – ricky116 Aug 18 '13 at 18:19