I am developing a basic ray tracer using OpenCL / OpenGL interop. I am having some issues with a kernel that uses local memory shared within a work-group.

Here is the kernel:

__kernel void ComputeDirectionalShadowTexture(
    write_only image2d_t shadowTexture,
    read_only image2d_t positionTexture,
    __constant float3* lightDirection, __constant float4* spheres)
{
    __local bool* shadowReduce[2];
    __local size_t idX, idY, idZ, localID;
    idX = get_global_id(0);
    idY = get_global_id(1);
    idZ = get_global_id(2);

    localID = get_local_id(2);

    //...Read Textures
    //...Perform Computation

    //...Write results
    if(shadowReduce[localID])
        write_imagef(shadowTexture, threadCoord.xy, (float4)(1.0f, 0.0f, 0.0f, 1.0f));
}

When I run this, it is as if get_local_id() never returns 0 (or only ever returns 1).

I would expect the problem to be related to how I am invoking the kernel:

size_t numGlobal[3] =
{
    rBuffer->textureWidth,
    rBuffer->textureHeight,
    numSpheres
};
size_t numLocal[3] = { 1, 1, numSpheres};

cl_event execution;

//Execute kernel
clError = clEnqueueNDRangeKernel
(
    buffer->clQueue,
    members->directionalShadowKernel,
    3,
    NULL,
    numGlobal,
    numLocal,
    numBeforeExecution,
    completeBeforeExecution,
    &execution
);

Where numSpheres is a constant set to 2.

Any/all feedback is appreciated.

Mr. Nex
  • The question title is good and the answer actually matches it. But, the question text is not useful because it is too complicated. The error should not be related to raytracing. It should be reproducible by a much shorter kernel. Your question would be more useful for further readers, if you reduce it to a much smaller example which produces the same error. – Martin Zabel Mar 15 '16 at 09:25
  • I agree. I will edit and fix this soon. I've been really quite busy the last few days but within the next 24 hours I'll go ahead and make those changes. Thank you for the constructive criticism. – Mr. Nex Mar 16 '16 at 22:13

1 Answer

I made a rookie mistake in the above code. If anybody ever has this problem, make sure you are not assigning the result of get_local_id() to a variable declared with the __local address space qualifier, as I do here:

localID = get_local_id(2);

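Of course the __local variable gets overwritten by every work-item in the work-group, because the local address space is shared across a work-group. Each work-item then reads back whichever ID was written last, not its own. The same applies to idX, idY and idZ, which are declared __local on the same line in the kernel.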

So instead of declaring localID as:

__local size_t localID;

it should be declared as:

size_t localID;
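
To make the overwrite concrete, here is a minimal sketch in plain C rather than OpenCL, so it can be run without a device. It is only a simulation under stated assumptions: the names simulate_work_group, shared_local_id and private_local_id are hypothetical, GROUP_SIZE is assumed to be 2 to match numSpheres, and the work-items are run one after another in index order, which a real device does not guarantee.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 2 /* assumption: matches numSpheres = 2 in the question */

/* Stand-in for a __local variable: ONE slot for the whole work-group. */
static size_t shared_local_id;

/*
 * Mimics two phases of a work-group, run here one work-item at a time:
 *   phase 1: every work-item stores its ID (the buggy way and the fixed way)
 *   phase 2: every work-item reads its ID back
 * seen_shared[i]  receives what work-item i reads from the shared slot.
 * seen_private[i] receives what work-item i reads from its private copy.
 */
void simulate_work_group(size_t seen_shared[GROUP_SIZE],
                         size_t seen_private[GROUP_SIZE])
{
    size_t private_local_id[GROUP_SIZE]; /* one private copy per work-item */
    size_t item;

    assert(seen_shared != NULL && seen_private != NULL);

    /* Phase 1: each work-item writes its own ID. */
    for (item = 0; item < GROUP_SIZE; ++item) {
        shared_local_id = item;        /* buggy: localID declared __local  */
        private_local_id[item] = item; /* fixed: localID declared private  */
    }

    /* Phase 2: each work-item reads the ID back. */
    for (item = 0; item < GROUP_SIZE; ++item) {
        seen_shared[item] = shared_local_id;         /* last writer wins */
        seen_private[item] = private_local_id[item]; /* own ID preserved */
    }
}
```

With the sequential order assumed here, both simulated work-items read 1 from the shared slot, while the private copies hold 0 and 1. That matches the symptom in the question, where get_local_id() appeared to return only 1. On a real device the write order is not specified, so which ID survives in the __local variable may differ.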

Hope this helps somebody.

user703016
Mr. Nex