
I am writing an OpenCL kernel that captures the triangles in my current GL scene.
To do that, I pass my vertex streams and their indices to the kernel.

Here's the declaration for my kernel entry:

__kernel
void CaptureTriangles(
    const uint NumTriangles,
    const float16 WorldMatrix,
    __constant ushort3 *IndexDataBlock,
    __constant struct Vertex *DataBlock,
    __global struct Triangle *TriangleBuffer,
    __global uint *TriangleBufferCount)

The Vertex structure is defined as such:

struct Vertex
{
    float3  position;
    float3  normal;
    float   materialIndex;
}
__attribute__((packed));

Now this stream was created through GL and that's how the data is laid out.

When fetching the triangles, I do the following in kernel code:

const ushort3 idx = IndexDataBlock[get_global_id(0)];
const struct Vertex v0 = DataBlock[idx.x],
                    v1 = DataBlock[idx.y],
                    v2 = DataBlock[idx.z];

But it seems that OpenCL keeps re-aligning the Vertex struct to its own internal requirements even though it is declared with __attribute__((packed)), so the triangles are never captured properly.

Switching from __constant struct Vertex *DataBlock to __constant float *DataBlock and fetching each float explicitly in the kernel code fixes the issue. So this works when reading float by float:

// __constant float *DataBlock
float4 p0 = (float4)(DataBlock[7 * idx.x + 0], DataBlock[7 * idx.x + 1], DataBlock[7 * idx.x + 2], 1.0f),
       p1 = (float4)(DataBlock[7 * idx.y + 0], DataBlock[7 * idx.y + 1], DataBlock[7 * idx.y + 2], 1.0f),
       p2 = (float4)(DataBlock[7 * idx.z + 0], DataBlock[7 * idx.z + 1], DataBlock[7 * idx.z + 2], 1.0f);

I'd rather use the struct Vertex syntax for code clarity. Is there any way to stop OpenCL from re-aligning structs?

user703016
  • Cool, yes I've seen that in other places. So a float3 is always aligned to a float4 then, disregarding any packed attribute... I guess what I was after was a way to tell OpenCL how to interpret some data that's already laid out in memory. Sounds like that's not possible, thanks for the incredibly quick response! Cheers. – Guillaume Boissé Nov 10 '15 at 11:02

1 Answer


In OpenCL, float3 has the same size and alignment as float4, i.e. 16 bytes (on the host, cl_float3 is the same type as cl_float4). Your GL code, however, writes true three-float values of 12 bytes each.

__attribute__((packed)) will not fix your problem, because as far as CL is concerned the struct is already packed; its float3 members are simply 16 bytes each instead of 12.
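To make the mismatch concrete, here is a rough host-side C sketch of the two layouts. Float3AsCL, VertexAsCL and VertexAsGL are hypothetical names made up for this illustration, and Float3AsCL only imitates the 16-byte size and alignment of an OpenCL float3; it is not a real CL type.

```c
#include <stddef.h>

/* Hypothetical stand-in for an OpenCL float3: 16 bytes, 16-byte aligned. */
typedef struct { _Alignas(16) float s[4]; } Float3AsCL;

/* Roughly how the kernel sees struct Vertex: each float3 occupies 16 bytes. */
typedef struct {
    Float3AsCL position;      /* offset 0  */
    Float3AsCL normal;        /* offset 16 */
    float      materialIndex; /* offset 32, struct padded out to 48 bytes */
} VertexAsCL;

/* How the GL stream is laid out: 7 tightly packed floats per vertex. */
typedef struct {
    float position[3];        /* offset 0  */
    float normal[3];          /* offset 12 */
    float materialIndex;      /* offset 24, 28 bytes in total */
} VertexAsGL;
```

With these sizes, the kernel steps 48 bytes per vertex while the buffer only advances 28, so every vertex after the first is read from the wrong place, and even within the first vertex the normal is read from offset 16 rather than 12.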

You will have to parse the data manually, I'm afraid.
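One way to keep the struct syntax while still parsing manually is to build the vertex from scalar floats in a small helper. The sketch below is plain C so it can be compiled on the host; Vec3, PackedVertex and load_vertex are hypothetical names, and the 7-floats-per-vertex stride is taken from the question. In kernel code the same idea should carry over by declaring the struct with scalar float members only, or by reading the three components with the built-in vload3, which as far as I know only needs float alignment.

```c
#include <stddef.h>

/* Stride of the GL stream from the question: 3 position + 3 normal + 1 index. */
enum { FLOATS_PER_VERTEX = 7 };

/* Hypothetical 12-byte vector made of scalars, so no 16-byte alignment applies. */
typedef struct { float x, y, z; } Vec3;

typedef struct {
    Vec3  position;
    Vec3  normal;
    float materialIndex;
} PackedVertex;

/* Reads vertex number `index` out of the tightly packed float stream. */
static PackedVertex load_vertex(const float *data, size_t index)
{
    const float *p = data + FLOATS_PER_VERTEX * index;
    PackedVertex v;
    v.position.x    = p[0];
    v.position.y    = p[1];
    v.position.z    = p[2];
    v.normal.x      = p[3];
    v.normal.y      = p[4];
    v.normal.z      = p[5];
    v.materialIndex = p[6];
    return v;
}
```

The calling code can then write v0 = load_vertex(DataBlock, idx.x) and keep using v0.position and v0.normal, which reads almost like the original struct version.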

DarkZeros