OpenCL inter-context buffer aliasing

Question

Suppose I have 2 OpenCL-capable devices on my machine (not including CPUs); and suppose that an evil colleague of mine creates a different context for each of them, which I have to work with.

I know I can't share buffers between contexts - not properly and officially, at least. But suppose that I create two OpenCL buffers, one in each context, and pass to each of them the same region of host memory, with the CL_MEM_USE_HOST_PTR flag. For example:

enum { size = 1234 };
//...
cl_context context_1 = clCreateContext(NULL, 1, &some_device_id, NULL, NULL, NULL);
cl_context context_2 = clCreateContext(NULL, 1, &another_device_id, NULL, NULL, NULL);

void* host_mem = malloc(size);
assert(host_mem != NULL);
// Both buffers alias the same host allocation
cl_mem buff_1 = clCreateBuffer(context_1, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, size, host_mem, NULL);
cl_mem buff_2 = clCreateBuffer(context_2, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, size, host_mem, NULL);

I realize that, officially,

The result of OpenCL commands that operate on multiple buffer objects created with the same host_ptr or overlapping host regions is considered to be undefined.

But what will actually happen if I copy to this buffer from one device, and from this buffer to another device? I'm specifically interested in the case of (relatively-recent) AMD and NVIDIA GPUs.

score 1 · Answer 1 · answered Jan 09 '20 at 13:33

If your OpenCL implementation's vendor guarantees some kind of specific behaviour that goes beyond the standard, then go with that and make sure to follow any instructions about limitations to the letter.

If it doesn't, then you have to assume what the standard says.

pmdj


I was hoping to rely on people's experience on this matter... – einpoklum Jan 09 '20 at 13:54
Seems an extremely risky strategy to rely on hearsay about this sort of thing. Get your vendors' guarantees on record, otherwise assume the worst. Would you write code that triggered undefined behaviour as per the C or C++ standard if someone told you that relying on signed integer overflow had worked for them in the past? (Better still, fix your existing code to use an OpenCL context you create, so you can share buffers.) – pmdj Jan 09 '20 at 13:55
I wouldn't put it in spaceship control code due for release next week; but I would certainly spend some time playing with it while I wait for the split-context bug report to be assigned and handled. – einpoklum Jan 09 '20 at 15:46

score 0 · Answer 2 · answered Jan 14 '20 at 13:26

I know I can't share buffers between contexts

It's not the contexts that are the problem. It's platforms. There are essentially two cases:

1) You want to share buffers between devices from the same platform. In that case, simply create a single context with all devices, don't complicate your life, and let the platform handle it.

2) You need to share buffers between devices from different platforms. In that case, you're on your own.

Waiting "for the split-context bug report to [be] assigned and handled" isn't going to get you anywhere: if the contexts are from the same platform, they'll tell you what I said in 1), and if they're from different platforms, they'll tell you it's impossible to support in any sane way.

"what will actually happen" ... depends (on a gajillion things). Some platforms will try to map the memory pointer (if it's properly aligned, for some definition of "properly") to the device address space. Some platforms will just silently copy it to device memory. Some platforms will also update the contents of the host memory after every enqueued command (which could mean a huge slowdown), while others will only update it at some specific "synchronization points".

My personal experience is to avoid CL_MEM_USE_HOST_PTR unless I know I'm working with an iGPU or a CPU implementation (and have properly aligned pointers).
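
As a rough illustration of what "properly aligned" might look like in practice, here is a minimal sketch that allocates a host region on an assumed 4096-byte boundary. The alignment value and the helper names (round_up, alloc_host_region, is_aligned) are hypothetical and chosen only for the example; the actual requirement is vendor-specific, so check your implementation's documentation.

```c
#include <stdint.h>
#include <stdlib.h>

/* ASSUMPTION: 4096 bytes (a common page size) stands in for whatever
   alignment a particular OpenCL implementation actually requires. */
enum { ASSUMED_ALIGNMENT = 4096 };

/* Round n up to the next multiple of `multiple` (which must be > 0). */
size_t round_up(size_t n, size_t multiple)
{
    return (n + multiple - 1) / multiple * multiple;
}

/* Hypothetical helper: allocate at least `size` bytes on an
   ASSUMED_ALIGNMENT boundary. C11 aligned_alloc expects the size to be
   a multiple of the alignment, so the size is rounded up first.
   Release the result with free(). */
void *alloc_host_region(size_t size)
{
    return aligned_alloc(ASSUMED_ALIGNMENT, round_up(size, ASSUMED_ALIGNMENT));
}

/* Check whether a pointer sits on the given alignment boundary. */
int is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}
```

A pointer obtained this way could then be passed as host_ptr to clCreateBuffer with CL_MEM_USE_HOST_PTR, using the rounded-up size if the implementation also cares about size granularity. Whether the platform then maps the memory or silently copies it is still up to the implementation.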

If you have AMD and NVIDIA GPUs in the same machine, I'm not aware of any official way they can share buffers efficiently, which means you'll have to go through host memory anyway... in which case I'd avoid any games with CL_MEM_USE_HOST_PTR and just rely on explicit mapping (clEnqueueMapBuffer / clEnqueueUnmapMemObject) or explicit transfers (clEnqueueReadBuffer / clEnqueueWriteBuffer).

You've misread my question. You wrote "simply create a single context with all devices" - and I would do this of course, but: "an evil colleague of mine creates a different context for each of them, which I have to work with." – einpoklum Jan 14 '20 at 14:41
