15

I'm using OpenCL on an NVIDIA GPU and I keep getting CL_INVALID_KERNEL_ARGS when I try to execute a kernel. I've stripped it down to a very simple program:

__kernel void foo(int a, __write_only image2d_t bar)
{
  int2 coords = {0, get_global_id(0)};
  write_imagef(bar, coords, (float4)a);
}

With the following C program (initialization and error checking skipped for brevity):

cl_kernel foo = clCreateKernel(program, "foo", &err);
int a = 42;
clSetKernelArg(foo, 0, sizeof(int), &a);

cl_image_format fmt = {CL_INTENSITY, CL_FLOAT};
cl_mem bar = clCreateImage2D(ctx, CL_MEM_WRITE_ONLY|CL_MEM_ALLOC_HOST_PTR, &fmt, 100, 1, 0, NULL, &err));
clSetKernelArg(foo, 1, sizeof(cl_mem), &bar);

size_t gws[] = {100};
size_t lws[] = {100};
cl_event evt;
clEnqueueNDRangeKernel(queue, foo, 1, NULL, gws, lws, 0, NULL, &evt);
clFinish(queue);

The clEnqueueNDRangeKernel keeps returning CL_INVALID_KERNEL_ARGS. Any ideas?

Trevor
  • Shouldn't your `clSetKernelArg` calls be setting `kern` instead of `foo`? – KLee1 Nov 08 '12 at 22:23
  • Also the fourth argument of `clEnqueueNDRangeKernel` (global_work_offset) must be NULL according to the spec, but you are passing `gwo`, _a pointer to a NULL value_. – James Beilby Nov 08 '12 at 22:56
  • KLee1 - Sorry, that's a transcription error, I've fixed it. – Trevor Nov 09 '12 at 13:49
  • James - I changed that but it had no bearing on the error. Changed it in the sample. – Trevor Nov 09 '12 at 13:52
  • I'm always casting the arg_value to (void *) inside clSetKernelArg(). Try that perhaps. – Paul Irofti Nov 09 '12 at 16:54
  • This looks fine to me now, apart from an extra bracket at the end of clCreateImage2D... Can you provide the actual code that you are running? Have you tried breaking down the problem, e.g. removing the first kernel argument, removing CL_MEM_ALLOC_HOST_PTR? – James Beilby Nov 09 '12 at 18:34
  • Hey Trevor, can you check the Local work size for your device? It is usually a power of 2, and can be unspecified if needed. I would try NULL for the local work size. – Austin Jun 25 '17 at 14:49

2 Answers

5

See https://stackoverflow.com/a/20566270/431528.

How large are the buffer objects you are passing? __constant arguments are allocated from a separate memory space rather than from global memory, so you have probably run out of constant memory.

Check CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE using clGetDeviceInfo to ensure you are not exceeding that size.

hiddensunset4
4

You are trying to pass a host variable to the kernel. You need to create a cl_mem object, copy the value into it using clEnqueueWriteBuffer, and then pass the cl_mem or cl_int variable to the kernel. Other than that, your code looks fine to me.

Quonux
Younes Nj