I want to allocate a vector and use its data pointer to create a zero-copy buffer on the GPU. The cl_arm_import_memory extension can be used to do this, but I am not sure whether it is supported by all Mali Midgard OpenCL drivers.

I was reading the extension documentation and am puzzled by the following sentence:

If the extension string cl_arm_import_memory_host is exposed then importing from normal userspace allocations (such as those created via malloc) is supported.

What exactly does this mean? I am working specifically with Rockchip RK3399 boards. Any help is appreciated.

1 Answer

If the extension string cl_arm_import_memory_host is exposed

This means you need to check the CL_DEVICE_EXTENSIONS property of your OpenCL device using the clGetDeviceInfo() function. Split the returned string into extension names (they are separated by spaces) then check if "cl_arm_import_memory_host" is one of those strings.
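A minimal sketch of that check, assuming you already have the extension string from clGetDeviceInfo(). The helper name has_extension is my own, not part of OpenCL. It matches whole space-separated tokens, because a plain substring search for "cl_arm_import_memory" would also match "cl_arm_import_memory_host".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL API): returns true if `name` appears as
 * a complete, space-separated token in `extensions`, i.e. the string
 * returned for CL_DEVICE_EXTENSIONS. */
bool has_extension(const char *extensions, const char *name)
{
    if (extensions == NULL || name == NULL || *name == '\0')
        return false;

    size_t len = strlen(name);
    const char *p = extensions;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or right after a space... */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        /* ...and must be followed by a space or the end of the string. */
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p++; /* keep searching after this partial match */
    }
    return false;
}
```

You would then call something like has_extension(ext_string, "cl_arm_import_memory_host") before attempting to import host memory.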

Note that the extension in question consists of multiple different sub-features:

cl_arm_import_memory
cl_arm_import_memory_host
cl_arm_import_memory_dma_buf
cl_arm_import_memory_protected

cl_arm_import_memory will be reported if at least one of the other extension strings is also reported.

So if your implementation supports importing host memory it will list both cl_arm_import_memory and cl_arm_import_memory_host.

If the correct feature is supported, you will probably need to get a pointer to the extension's clImportMemoryARM() function by calling clGetExtensionFunctionAddressForPlatform.

Then, use the extension's features as documented.

pmdj
  • Hey, thanks for the reply. I checked the CL_DEVICE_EXTENSIONS property, and the returned string contains "cl_arm_import_memory" but not "cl_arm_import_memory_host". What does this mean? Can I use host-side allocated buffers and pass them to the clImportMemoryARM function? –  Oct 21 '19 at 04:57
  • @abhiverma It looks like you need to test for the `cl_arm_import_memory_host` sub-feature specifically. I've updated my answer with the critical quote from the extension: `cl_arm_import_memory` just means that at least *one* of the sub-features is supported, but you need to additionally check for the specific one you are after. – pmdj Oct 21 '19 at 10:44
  • So, pmdj, how do I test for the cl_arm_import_memory_host sub-feature specifically? Should I call the function and validate the results, or is there another way? –  Oct 21 '19 at 10:56
  • As I said, you need to check if `"cl_arm_import_memory_host"` is listed in the `CL_DEVICE_EXTENSIONS` property. If it is not, that part of the extension is not supported. At least that is my reading of the extension specification. – pmdj Oct 21 '19 at 10:58
  • If it's showing cl_arm_import_memory then, as far as I understand, it should support at least one of the other sub-features, but when I query the device info, none of the other sub-features are listed. –  Oct 21 '19 at 10:58
  • OK, that wasn't clear before, you just mentioned that `"cl_arm_import_memory_host"` is not listed. It sounds like you've got a faulty OpenCL driver in some way. If you can get an updated driver, I'd try that first. Otherwise, you could try to get the function pointer; if that fails, you obviously can't call the function. If it succeeds, you could try using it for importing host memory, and make sure you check the return value. If it succeeds and works, great. If not, you're out of luck. – pmdj Oct 21 '19 at 11:01
  • Note that standard OpenCL also supports the `CL_MEM_USE_HOST_PTR` flag to `clCreateBuffer()`, which on many implementations will suffice for zero-copy operation. There is however no *guarantee* that this will create a zero-copy buffer. – pmdj Oct 21 '19 at 11:02
  • This is the total list of extensions supported by my device –  Oct 21 '19 at 11:03
  • cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_khr_image2d_from_buffer cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory –  Oct 21 '19 at 11:03
  • As the extension in question is listed last: are you sure you provided a large enough buffer to `clGetDeviceInfo()`? Note that the required size is returned via the `param_value_size_ret` parameter, so if your `param_value_size` is less than what was returned in `param_value_size_ret`, your code needs to allocate a larger buffer and try again. – pmdj Oct 21 '19 at 11:05
  • Can't I just include the "CL/cl_ext.h" file in my driver program and check for the function? Or should I go directly for the function pointer? And no, on my device, USE_HOST_PTR doesn't give the desired behaviour. –  Oct 21 '19 at 11:07
  • The buffer size is not an issue; I am using the C++ headers and just passing a std::string to query the device info. –  Oct 21 '19 at 11:08
  • I guess testing for whether `clImportMemoryARM` exists according to the weak linking method for your platform as it's exposed in the `cl_ext.h` header should work, yes. – pmdj Oct 21 '19 at 11:10