
I am wondering why the R3 protocol performs so well when I use many different buffers that exhaust the registration cache. Does it not need to pin and unpin the buffers provided for sending, or how does it hide this overhead? Is it always a good choice to stick with the R3 protocol?

The plot linked below shows my observation. I used 2 nodes sending and receiving in parallel. The x-axis denotes the number of buffers n (1 MB each) used for sending. The main loop looks like this:

// Take time
for (i = 0; i < 20; i++) {
  for (a = 0; a < n; a++) IRecv(rec_buffer[a])
  for (a = 0; a < n; a++) ISend(send_buffer[a])
  waitForAllRecv()
  waitForAllSend()
}
// Plot time
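
A toy model may help make "exhausting the registration cache" concrete. The sketch below is not MVAPICH2 code: it assumes a registration cache with a fixed number of slots and least-recently-used eviction, and it counts how often a buffer would have to be pinned when the loop above cycles through n buffers. The function name `count_registrations` and the parameter `cache_slots` are hypothetical, and the real cache may be limited by size or use a different eviction policy.

```python
from collections import OrderedDict


def count_registrations(num_buffers, cache_slots, iterations=20):
    """Count pin operations when cycling through buffers in a fixed order.

    Assumption: the registration cache holds `cache_slots` buffers and
    evicts the least recently used one. This is a simplified model, not
    the actual MVAPICH2 implementation.
    """
    cache = OrderedDict()  # buffer id -> True, oldest entry first
    registrations = 0
    for _ in range(iterations):
        for buf in range(num_buffers):
            if buf in cache:
                # Cache hit: the buffer is still pinned, nothing to do.
                cache.move_to_end(buf)
            else:
                # Cache miss: the buffer has to be pinned again.
                registrations += 1
                if len(cache) >= cache_slots:
                    # Evict (unpin) the least recently used buffer.
                    cache.popitem(last=False)
                cache[buf] = True
    return registrations


if __name__ == "__main__":
    for n in (4, 8, 9, 16):
        print(n, count_registrations(n, cache_slots=8))
```

Under these assumptions, 8 buffers with 8 slots are pinned only 8 times over all 20 iterations, while 9 buffers are pinned 180 times, because a cyclic access pattern misses on every access once n exceeds the cache capacity. That sharp transition would be consistent with a drop that moves when the cache size changes.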

See plot at: http://i47.tinypic.com/2vkn6ty.jpg

Wesley Bland
  • I am not an InfiniBand expert and definitely not a user of MVAPICH2, but with many outstanding operations you are probably exhausting the completion queues. There should be a parameter to control the CQ size for RDMA-based protocols. – Hristo Iliev Apr 04 '13 at 06:58
  • I would say this is unlikely, as I can move the drop by changing the size of the registration cache. So the reason for the decline seems to be the pinning and unpinning of pages. – Stephan Wolf Apr 05 '13 at 18:44

0 Answers