
I'm working on an OpenCL program, but the output differs between executions. I think it has to do with passing arguments to the kernel, because when I hardcode the values for a specific execution, the output is the same on every execution.

My kernel looks like this:

__kernel void sample_kernel(__global double *BufferA, int scalar1, int scalar2, int scalar3, ...) {

    for(int i = -1*scalar1; i < scalar1; i++) {
        for(int j = -1*scalar1; j < scalar1; j++) {
            if(scalar2 > 0 && scalar3 > 0) // do something.
        }
    }
}

And this is how I set the kernel arguments:

int scalar1 = 1;
int scalar2 = 2;
int scalar3 = 3;

Samplekernel.setArg(0, d_BufferA);
Samplekernel.setArg(1, sizeof(int), &scalar1);
Samplekernel.setArg(2, sizeof(int), &scalar2);
Samplekernel.setArg(3, sizeof(int), &scalar3);

The weird thing is that when I add...

if(scalar1 != 1) scalar1 = 1;
if(scalar2 != 2) scalar2 = 2;
if(scalar3 != 3) scalar3 = 3;

...in the kernel before the double for-loop, the output is correct.

I run my program on an Nvidia K20m GPU, OpenCL version 1.1. When I run my code on an Nvidia C2075 everything seems to work fine...

Does anybody have an idea what the problem might be? It looks like the value is not copied correctly or that it is overwritten, but I don't access that value before the for-loops...

Thanks in advance!

Balanidhren

1 Answer


It looks like you are passing a pointer to an int to setArg:

Samplekernel.setArg(1, sizeof(int), &scalar1);

and then, in your kernel parameter list, you declare int values, not pointers:

__kernel void sample_kernel(__global double *BufferA, int scalar1, ...

You could either use pointers in the kernel parameter list, like so:

__kernel void sample_kernel(__global double *BufferA, global int *scalar1,

Or, and this is what I would suggest, copy the scalar directly. (I could not find your version of kernel.setArg(...) in the C++ bindings specification, only in the implementation on khronos.org.) Like so:

Samplekernel.setArg(1, scalar1);

This also has the advantage that the variable is available in the kernel's private memory space rather than in global memory, as it would be if you passed a buffer as the argument.

The version of Kernel::setArg you are using might not copy the value and might only be intended for host-bound kernels, but I'm not sure of that.

Also, you may want to check the return value of setArg for errors.
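
One way to make that check hard to forget is a small helper that turns a status code into an exception. This is only a sketch: `check_cl` is a hypothetical name, not part of the OpenCL C++ bindings. It relies on the fact that OpenCL defines CL_SUCCESS as 0, and it takes a plain int so that it compiles without the OpenCL headers.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the OpenCL bindings): throws if an
// OpenCL-style status code signals failure. OpenCL defines CL_SUCCESS
// as 0, so any non-zero status is treated as an error here.
inline void check_cl(int status, const std::string& what) {
    if (status != 0) {
        throw std::runtime_error(what + " failed with OpenCL error " +
                                 std::to_string(status));
    }
}

// Intended use with the question's code (needs the OpenCL headers,
// so it is shown as a comment only):
//   check_cl(Samplekernel.setArg(1, scalar1), "setArg(1)");
//   check_cl(Samplekernel.setArg(2, scalar2), "setArg(2)");
```

Wrapping every setArg call this way should surface a rejected argument immediately instead of leaving the kernel to run with an unset value.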

Matthias Holzapfel
  • This isn't quite correct. You can pass a pointer to int to `setArg`, and if you do so, the parameter is passed as a scalar, not a pointer, *but* it must be declared in private memory space, such as `private int scalar1`. The `setArg` function doesn't magically distinguish between a pointer and a scalar, it just checks if the argument is an OpenCL memory object, and if it isn't, simply performs a block memory copy into private memory, leaving it the developer's responsibility to ensure that whatever is copied to the kernel can be understood by the kernel. At least my C++ bindings do that... – Thomas Apr 12 '13 at 13:36
  • From section **6.5.4** in the standard I understand that the _private_ qualifier does not actually make any difference, since everything without a qualifier is automatically _private_ – Matthias Holzapfel Apr 12 '13 at 15:11
  • Besides that, I now see that you're right. 'setArg(1, scalar1)' and 'setArg(1, sizeof(int), &scalar1)' resolve to the same cl-call. – Matthias Holzapfel Apr 12 '13 at 15:19
  • Ah, thanks for the clarification. I prefer to always explicitly provide memory spaces for kernel arguments, just a habit though. – Thomas Apr 13 '13 at 01:11
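
The point settled in the comments, that `setArg(1, scalar1)` and `setArg(1, sizeof(int), &scalar1)` end up in the same call, can be illustrated without an OpenCL runtime. The sketch below uses `MockKernel`, a hypothetical stand-in written for this illustration and not the real cl::Kernel. It assumes, as the comments describe, that the templated overload just forwards the size and address of its argument to the size-and-pointer overload, which copies the bytes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cl::Kernel, for illustration only. It records
// the raw bytes it receives, mimicking a block copy of a by-value argument.
struct MockKernel {
    std::vector<unsigned char> last_bytes;  // bytes of the most recent argument

    // Size-and-pointer form: copies `size` bytes starting at `value`.
    int setArg(unsigned index, std::size_t size, const void* value) {
        (void)index;  // the mock keeps only the latest argument
        const unsigned char* p = static_cast<const unsigned char*>(value);
        last_bytes.assign(p, p + size);
        return 0;  // plays the role of CL_SUCCESS
    }

    // Templated form: forwards sizeof(T) and the address of the value.
    template <typename T>
    int setArg(unsigned index, const T& value) {
        return setArg(index, sizeof(T), &value);
    }
};

// Returns true if both call styles hand over exactly the same bytes.
bool both_forms_copy_same_bytes() {
    int scalar1 = 1;
    MockKernel a, b;
    a.setArg(1, scalar1);                // value form
    b.setArg(1, sizeof(int), &scalar1);  // size-and-pointer form
    return a.last_bytes == b.last_bytes &&
           a.last_bytes.size() == sizeof(int);
}
```

In this model the `&scalar1` in the question's code is only the source address of the copy; the kernel still receives the int by value, so the mismatch suspected in the answer does not arise.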