I'm using MPI nonblocking messages to communicate between two tasks. The communication pattern is as follows: each task has a master thread that receives messages from the other task, plus five or so worker threads that do the computation and send messages to the other task. The master thread does nothing but loop, testing for incoming messages.

My problem is that while task 0 receives everything sent from task 1 almost instantly (the numbers of messages sent and received roughly match), task 1 only receives about a quarter of the messages sent by task 0. After running for a minute, there are hundreds of thousands of outstanding messages.

Using PAPI, I've determined that task 1 seems to block in test and irecv. Its instruction throughput is only 0.03 instr/cycle, as opposed to >0.2 for the other task, and stopping the task in the debugger shows that it is trying to acquire a lock. However, the receive and test calls that block are not the ones for the "missing" messages but the ones for another class of much rarer messages.

I realize it's hard to say what could cause this without actually trying the code, but I find such an asymmetry in MPI performance puzzling. The task that can't keep up with the receives is not failing for lack of trying; it really is spending all its time testing for incoming messages.

I'm using OpenMPI 1.5.3 with MPI_THREAD_MULTIPLE, and the communication is over sm (the two tasks are on the same node).

Any ideas how to track this down would be appreciated.

Lutorm
  • What is your CPU/motherboard? – osgx Nov 29 '11 at 23:17
  • I believe it's a Dell PowerEdge C6100 with 2 X5650 Xeons. – Lutorm Nov 29 '11 at 23:58
  • Are you oversubscribing the machine? That is, are you using more threads/processes than there are cores on the host? If so, MPI performance will likely be erratic with most (though not all) MPI implementations. The fact that you are using multithreading may also lead to variable performance, depending on Open MPI's design. – Dave Goodell Dec 08 '11 at 20:45
  • @Dave: No. I'm aware of the oversubscription issue, but that's not what's going on here. I think I figured it out from a related discussion on the openmpi list: Once a task falls behind on receiving, each receive becomes more expensive because each receive now needs to be tested against the pending messages. This causes it to fall further behind in a vicious circle. I managed to work around the problem by just spending more time testing for these receives rather than testing for other things, so now it works correctly. – Lutorm Dec 08 '11 at 22:13
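
The effect described in the last comment can be illustrated with a toy model. The sketch below is plain C with no MPI calls, and it is not Open MPI's actual matching code: it simply assumes that unmatched messages sit in a queue in arrival order and that each receive or test scans that queue from the head for its tag. All names (`Queue`, `match_receive`, `simulate`, `TAG_COMMON`, `TAG_RARE`) are made up for illustration. It only counts how many queue entries are inspected; it does not model timing or locking.

```c
#include <stddef.h>

/* Hypothetical toy model, NOT Open MPI internals: arrived-but-unmatched
   messages are kept in arrival order, and every receive/test scans the
   queue from the head looking for its tag. */
enum { TAG_COMMON = 0, TAG_RARE = 1, QUEUE_CAP = 1 << 16 };

typedef struct {
    int tags[QUEUE_CAP];
    size_t len;
} Queue;

/* Append a message; silently dropped if the toy queue is full. */
static void enqueue(Queue *q, int tag)
{
    if (q->len < QUEUE_CAP)
        q->tags[q->len++] = tag;
}

/* Try to match one receive for `tag`. Returns how many queued entries
   were inspected; on a match the entry is removed and *found is set to 1. */
static size_t match_receive(Queue *q, int tag, int *found)
{
    size_t inspected = 0;
    *found = 0;
    for (size_t i = 0; i < q->len; i++) {
        inspected++;
        if (q->tags[i] == tag) {
            for (size_t j = i + 1; j < q->len; j++)
                q->tags[j - 1] = q->tags[j];   /* close the gap */
            q->len--;
            *found = 1;
            break;
        }
    }
    return inspected;
}

/* Per round: `arrivals` common messages arrive, then the receiver does
   `common_polls` receives for the common tag and `rare_polls` tests for
   the rare tag (which never arrives in this model). Returns the total
   number of entries inspected; *backlog gets the final queue length. */
static size_t simulate(int rounds, int arrivals, int common_polls,
                       int rare_polls, size_t *backlog)
{
    static Queue q;   /* static: keeps the large array off the stack */
    size_t total = 0;
    int found;

    q.len = 0;
    for (int r = 0; r < rounds; r++) {
        for (int i = 0; i < arrivals; i++)
            enqueue(&q, TAG_COMMON);
        for (int i = 0; i < common_polls; i++)
            total += match_receive(&q, TAG_COMMON, &found);
        for (int i = 0; i < rare_polls; i++)
            total += match_receive(&q, TAG_RARE, &found);
    }
    *backlog = q.len;
    return total;
}
```

In this model, `simulate(100, 4, 1, 3, &backlog)` (one receive for the common messages and three tests for the rare ones per round) inspects 45550 entries and leaves a backlog of 300, because every rare-tag test walks the whole growing queue. `simulate(100, 4, 4, 1, &backlog)`, which spends the polls on the common messages instead, inspects 400 entries and leaves no backlog. That matches the shape of the workaround described above, under the stated assumptions.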
