
I have the following code, which works well on one machine, but when I try to run it on another machine with a better graphics card I get errors:

global[0] = 512; global[1] = 512;
local [0] = 16; local [1] = 16;
ciErrNum = clEnqueueNDRangeKernel(commandQueue, myKernel, 2, NULL, global, local, 0, NULL, &event);

Errors:

Error @ clEnqueueNDRangeKernel: CL_INVALID_KERNEL_ARGS
Error @ clWaitForEvents: CL_INVALID_KERNEL_ARGS

Any idea what the problem is?

Christian Rau
Avraam Mavridis
  • Uh...so...what are the kernel arguments (and the kernel itself)? – Christian Rau Dec 13 '13 at 09:15
  • @ChristianRau I finally found that the `__constant` keyword I used in the kernel works on only two of the three machines I tried; I don't know why. When I changed it to `__global` in the kernel, it worked on the third as well. – Avraam Mavridis Dec 13 '13 at 11:21
  • Constant memory is a scarce and limited resource. Maybe your third device has very little constant memory. – DarkZeros Dec 13 '13 at 11:35

1 Answer


How large are the buffer objects you are passing? __constant arguments are allocated from a separate memory space, not from global memory, so you have probably run out of constant memory.

The spec mandates that in the full profile a device must support at least 8 __constant arguments and at least 64 kB of constant memory. In the embedded profile the minimum drops to 4 arguments and 1 kB.

You can query the amount of constant memory available by checking the CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE using clGetDeviceInfo. Most likely your devices support way more than this minimum requirement.
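Once you have the limits, a quick host-side check can tell you whether a set of buffers is safe to declare as __constant. The following is a minimal sketch, not part of the OpenCL API: `fits_constant_limits` is a hypothetical helper, and it assumes the size limit is a budget shared by all __constant arguments together, which is the conservative reading. The two limit parameters are meant to hold the values reported by clGetDeviceInfo for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE (returned as a cl_ulong) and CL_DEVICE_MAX_CONSTANT_ARGS (returned as a cl_uint).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL function).
 * sizes/count:     byte sizes of the buffers you want to pass as __constant
 * max_buffer_size: value queried for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE
 * max_args:        value queried for CL_DEVICE_MAX_CONSTANT_ARGS
 * Assumption: the size limit is treated as a total budget for all
 * __constant arguments combined (the conservative interpretation). */
bool fits_constant_limits(const size_t *sizes, size_t count,
                          size_t max_buffer_size, size_t max_args)
{
    assert(sizes != NULL || count == 0);

    /* Too many __constant arguments for this device. */
    if (count > max_args)
        return false;

    /* Subtract from the remaining budget instead of summing,
     * so the check cannot overflow size_t. */
    size_t remaining = max_buffer_size;
    for (size_t i = 0; i < count; ++i) {
        if (sizes[i] > remaining)
            return false;
        remaining -= sizes[i];
    }
    return true;
}
```

If the check fails, falling back to __global for the largest buffers is one way to keep the kernel portable across devices with less constant memory.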

In most cases you should use the constant buffer if you are able as in general it is quite a lot faster than global memory.

In future, please provide more information in your question. When the error is CL_INVALID_KERNEL_ARGS, one really needs to know what the arguments to your kernel are.

Christian Rau
sharpneli
  • One amendment: The minimum amount of constant memory is 16kB in the full spec (1.0 and 1.1). Version 1.2 states 64kB, but it is not available from all vendors. – DarkZeros Dec 13 '13 at 13:25
  • You sure? Check http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf and it states 64kB. Also http://www.khronos.org/registry/cl/specs/opencl-1.0.pdf states the same. However I did make one mistake. The amount of supported constant arguments must be 8 at minimum. – sharpneli Dec 13 '13 at 13:32
  • Sorry, my mistake: what I said applies to "local" memory, not "constant" memory. Thank you for cross-checking. – DarkZeros Dec 14 '13 at 17:24
  • Wow, I spent 2 days trying to figure out what was happening in a similar case... it was the 64kB limit for the constant memory (which I didn't know about). Thanks, @sharpneli – fegemo Dec 01 '17 at 00:44