How to force OpenCL not to re-align structs?

Question

I am writing a kernel that captures the triangles in my current GL scene.
To do that, I pass my vertex streams along with my indices to the kernel.

Here's the declaration for my kernel entry:

__kernel
void CaptureTriangles(
    const uint NumTriangles,
    const float16 WorldMatrix,
    __constant ushort3 *IndexDataBlock,
    __constant struct Vertex *DataBlock,
    __global struct Triangle *TriangleBuffer,
    __global uint *TriangleBufferCount)

The Vertex structure is defined as follows:

struct Vertex
{
    float3  position;
    float3  normal;
    float   materialIndex;
}
__attribute__((packed));

This stream was created through GL, and that is how the data is laid out: seven tightly packed floats per vertex.

When fetching the triangles, I do the following in kernel code:

const ushort3 idx = IndexDataBlock[get_global_id(0)];
const struct Vertex v0 = DataBlock[idx.x],
                    v1 = DataBlock[idx.y],
                    v2 = DataBlock[idx.z];

But it seems that OpenCL keeps re-aligning the Vertex struct to its own internal requirements even though it is declared with __attribute__((packed)). As a result, the triangles are never captured properly.

Switching from __constant struct Vertex *DataBlock to __constant float *DataBlock and fetching each float explicitly in the kernel code fixes the issue. So this works when reading float by float:

// __constant float *DataBlock
float4 p0 = (float4)(DataBlock[7 * idx.x + 0], DataBlock[7 * idx.x + 1], DataBlock[7 * idx.x + 2], 1.0f),
       p1 = (float4)(DataBlock[7 * idx.y + 0], DataBlock[7 * idx.y + 1], DataBlock[7 * idx.y + 2], 1.0f),
       p2 = (float4)(DataBlock[7 * idx.z + 0], DataBlock[7 * idx.z + 1], DataBlock[7 * idx.z + 2], 1.0f);

I'd rather use the struct Vertex syntax for code clarity. Is there any way to get OpenCL to not re-align structs?

Cool, yes I've seen that in other places. So a float3 is always aligned to a float4 then, disregarding any packed attribute... I guess what I was after was a way to tell OpenCL how to interpret some data that's already laid out in memory. Sounds like that's not possible, thanks for the incredibly quick response! Cheers. — Guillaume Boissé, Nov 10 '15 at 11:02

score 3 · Accepted Answer · answered Nov 10 '15 at 11:12

For CL, cl_float3 and cl_float4 are equal in size (16 bytes). But in your case the GL code outputs real, tightly packed float3 values of 12 bytes each.

__attribute__((packed)) will not fix your problem, because as far as CL is concerned the struct is already packed; its float3 members are simply 16 bytes each instead of the 12 bytes GL wrote.

You will have to parse it manually, I'm afraid.
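
One way to keep struct-style access while still parsing manually is to describe the vertex with scalar float arrays instead of float3, so that no member asks for 16-byte alignment. The sketch below is plain host-side C, not kernel code, and only checks that such a struct matches the 7-float stride from the question. The names PackedVertex and load_position are hypothetical, and it assumes a 4-byte float. I would expect the same declaration to lay out the same way in OpenCL C (with the pointer qualified as __constant), but that is an assumption to verify on your implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct: mirrors the 7-float GL vertex with scalar arrays
   instead of float3, so every member only needs float alignment. */
struct PackedVertex
{
    float position[3];
    float normal[3];
    float materialIndex;
};

/* Layout checks: 7 floats per vertex and no padding between members. */
_Static_assert(sizeof(struct PackedVertex) == 7 * sizeof(float),
               "unexpected padding in PackedVertex");
_Static_assert(offsetof(struct PackedVertex, normal) == 3 * sizeof(float),
               "unexpected offset for normal");
_Static_assert(offsetof(struct PackedVertex, materialIndex) == 6 * sizeof(float),
               "unexpected offset for materialIndex");

/* Hypothetical helper: copies one vertex position into homogeneous
   coordinates (w = 1), like the float-by-float fetch in the question. */
void load_position(const struct PackedVertex *vertices, size_t index, float out[4])
{
    const struct PackedVertex *v = &vertices[index];
    out[0] = v->position[0];
    out[1] = v->position[1];
    out[2] = v->position[2];
    out[3] = 1.0f;
}
```

In the kernel itself, vload3 is another option: it reads three consecutive floats from a float pointer and only requires float alignment, so the raw __constant float *DataBlock can still be fetched one vector at a time.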
