I am using a GPU cluster without GPUDirect support. From this briefing, the following steps are performed when transferring GPU data across nodes:

  1. GPU writes to pinned sysmem1
  2. CPU copies from sysmem1 to sysmem2
  3. InfiniBand driver copies from sysmem2

Now I am not sure whether the second step happens implicitly when I transfer sysmem1 across InfiniBand using MPI. Assuming it does, my current programming model is something like this:

  1. cudaMemcpy(hostmem, devicemem, size, cudaMemcpyDeviceToHost)
  2. MPI_Send(hostmem, ...)

Is my above assumption correct, and will my programming model work without causing communication issues?

Hailiang Zhang
  • Depending on the MPI implementation... it may be possible to force all messages (regardless of size) to use the RDMA eager protocol. In the eager protocol, the MPI library will copy the GPU "sysmem1" into a pre-pinned buffer "sysmem2" for the RDMA transfer. This technique can be helpful for applications that cannot be modified. The amount of pre-pinned memory for best performance in this case can be quite large. – Stan Graves Sep 20 '13 at 19:28

1 Answer

Yes, you can use CUDA and MPI independently (i.e. without GPUDirect), just as you describe.

  1. Move the data from device to host
  2. Transfer the data as you ordinarily would, using MPI (see the example below)
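
To make those two steps concrete, here is a minimal sketch (my own illustration, not taken from the linked presentation) of sending a device buffer from rank 0 to rank 1 by staging it through pinned host memory. The buffer size is arbitrary and error checking is omitted for brevity:

    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int N = 1 << 20;                        /* number of floats to move (illustrative) */
        float *hostmem = NULL, *devicemem = NULL;
        cudaMallocHost((void **)&hostmem, N * sizeof(float));  /* pinned host buffer ("sysmem1") */
        cudaMalloc((void **)&devicemem, N * sizeof(float));

        if (rank == 0) {
            /* Step 1: copy from device memory into pinned host memory */
            cudaMemcpy(hostmem, devicemem, N * sizeof(float), cudaMemcpyDeviceToHost);
            /* Step 2: ordinary MPI send from host memory; any further staging
               into the InfiniBand driver's buffers is handled by the MPI library */
            MPI_Send(hostmem, N, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(hostmem, N, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            /* Mirror of step 1 on the receiver: host -> device */
            cudaMemcpy(devicemem, hostmem, N * sizeof(float), cudaMemcpyHostToDevice);
        }

        cudaFree(devicemem);
        cudaFreeHost(hostmem);
        MPI_Finalize();
        return 0;
    }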

You might be interested in this presentation, which explains CUDA-aware MPI and gives a side-by-side example on slide 11 of non-CUDA-aware MPI and CUDA-aware MPI.

Robert Crovella