cudaMallocHost / cudaHostAlloc on multi GPU

Question

The CUDA Runtime API documentation, in the Device Management section on cudaSetDevice, says:

Any host memory allocated from this host thread using cudaMallocHost() or cudaHostAlloc() or cudaHostRegister() will have its lifetime associated with device

So my question is: if I allocate host memory with cudaHostAlloc while the current device is device 0, and then copy that host memory to device memory on device 1, is there any limitation or problem?

If you have another question (and this is really a different question), please start a new question. Don't edit it into an already answered and accepted question. People are very unlikely to see the new edits because the question already has an accepted answer. — talonmies, Feb 08 '13 at 07:10

talonmies · Accepted Answer · 2013-02-07T10:38:13.333

In the "traditional" CUDA memory model, if you want to use a mapped host memory allocation in more than one context, you must allocate the memory with cudaHostAlloc() using the cudaHostAllocPortable flag. That will make the memory portable across all contexts.
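For the case in the question (allocate while device 0 is current, then copy to device 1), a minimal sketch of the portable allocation might look like the following. The buffer size, the variable names, and the `check` helper are illustrative assumptions, and the program assumes at least two GPUs are visible. It uses only cudaHostAllocPortable because the question is about a plain copy; a mapped allocation would additionally need cudaHostAllocMapped.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

int main() {
    int deviceCount = 0;
    check(cudaGetDeviceCount(&deviceCount), "cudaGetDeviceCount");
    if (deviceCount < 2) {
        printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    const size_t n = 1 << 20;               // illustrative size
    const size_t bytes = n * sizeof(float);

    // Allocate pinned host memory while device 0 is current.
    // cudaHostAllocPortable asks for the memory to be treated as
    // pinned by all CUDA contexts, not only the allocating one.
    check(cudaSetDevice(0), "cudaSetDevice(0)");
    float *hostBuf = nullptr;
    check(cudaHostAlloc((void **)&hostBuf, bytes, cudaHostAllocPortable),
          "cudaHostAlloc");
    for (size_t i = 0; i < n; ++i) hostBuf[i] = 1.0f;

    // Switch to device 1 and copy from the same host buffer.
    check(cudaSetDevice(1), "cudaSetDevice(1)");
    float *devBuf = nullptr;
    check(cudaMalloc((void **)&devBuf, bytes), "cudaMalloc");
    check(cudaMemcpy(devBuf, hostBuf, bytes, cudaMemcpyHostToDevice),
          "cudaMemcpy");

    check(cudaFree(devBuf), "cudaFree");
    check(cudaFreeHost(hostBuf), "cudaFreeHost");
    return 0;
}
```

Without the portable flag the copy would generally still work, but on a platform without unified addressing the other context may treat the buffer as ordinary pageable memory rather than pinned memory.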

If you are running on a platform with unified addressing support, then you shouldn't need to worry about it as long as you use cudaMemcpyDefault in any cudaMemcpy() operations on that memory.
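As a rough sketch of the unified addressing case, one could check the `unifiedAddressing` field of cudaDeviceProp for both devices and then let the runtime infer the copy direction with cudaMemcpyDefault. The buffer size, names, and the `check` helper are again assumptions for illustration, and the program assumes two GPUs.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

int main() {
    int deviceCount = 0;
    check(cudaGetDeviceCount(&deviceCount), "cudaGetDeviceCount");
    if (deviceCount < 2) {
        printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    // Unified addressing support is reported per device.
    cudaDeviceProp prop0, prop1;
    check(cudaGetDeviceProperties(&prop0, 0), "cudaGetDeviceProperties(0)");
    check(cudaGetDeviceProperties(&prop1, 1), "cudaGetDeviceProperties(1)");
    if (!prop0.unifiedAddressing || !prop1.unifiedAddressing) {
        printf("No unified addressing; use cudaHostAllocPortable instead.\n");
        return 0;
    }

    const size_t bytes = (1 << 20) * sizeof(float);  // illustrative size

    // Plain pinned allocation made while device 0 is current.
    check(cudaSetDevice(0), "cudaSetDevice(0)");
    void *hostBuf = nullptr;
    check(cudaHostAlloc(&hostBuf, bytes, cudaHostAllocDefault),
          "cudaHostAlloc");
    memset(hostBuf, 0, bytes);

    // Copy to device 1; cudaMemcpyDefault lets the runtime work out
    // the direction from the pointer values.
    check(cudaSetDevice(1), "cudaSetDevice(1)");
    void *devBuf = nullptr;
    check(cudaMalloc(&devBuf, bytes), "cudaMalloc");
    check(cudaMemcpy(devBuf, hostBuf, bytes, cudaMemcpyDefault),
          "cudaMemcpy");

    check(cudaFree(devBuf), "cudaFree");
    check(cudaFreeHost(hostBuf), "cudaFreeHost");
    return 0;
}
```

cudaMemcpyDefault is only valid on systems that support unified virtual addressing, which is why the sketch checks the device properties first.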
