I am exploring how to use structures in OpenCL.

I first tried this struct (defined on the host):

typedef struct UserStruct {

    cl_int x;
    cl_int y;
    cl_int z;
    cl_int w;
} UserStruct;

and this struct (defined on the device):

typedef struct UserStruct {
    int x;
    int y;
    int z;
    int w;
} UserStruct;

Using these structures, I create two buffers (para_input and para_output) and initialize them with different values. The kernel copies the values from para_input to para_output.

This example works fine.

But when I add a cl_int16 member to the struct, the copying kernel no longer works. Here is the modified structure:

typedef struct UserStruct {

    cl_int x;
    cl_int y;
    cl_int z;
    cl_int w; 

    cl_int16 vn16;
} UserStruct;

and this struct (defined on the device):

typedef struct UserStruct {
    int x;
    int y;
    int z;
    int w;

    int16 vn16;
} UserStruct;

Is there a requirement to align the structure on both the host and the device? Or what is the most common way to use structures in OpenCL? Thanks.

Erkang
  • Are `sizeof()` of both structures equal? Also, a runnable example would help (or, at least, an explanation of "the copying kernel does not work"). – fjarri Feb 03 '16 at 02:28
  • Thanks for the reply. You are right: the sizes differ between host and device. The size is 80 on the host but 128 on the device. Could you give some hints on how to handle this? I did not understand the alignment rules in the specification. Thanks. – Erkang Feb 03 '16 at 03:17

2 Answers

Expanding on the comment:

It seems that your problem is caused by a difference in structure layout between your C compiler and the OpenCL compiler. The C compiler packs the structure into the minimum of 80 bytes, while the OpenCL compiler produces 128 bytes: OpenCL C aligns each vector type to its own size, so the 64-byte int16 member starts at offset 64 rather than 16 (which is a good thing performance-wise). You can make the two match by specifying the alignment explicitly: either pack both structures, or align both to 128 bytes. See the OpenCL docs and your compiler's docs (which most probably use the same notation) for details.

In any case, I would recommend going with the 128-byte alignment, unless you are pressed for space. Declare your structures as:

typedef struct UserStruct {

    cl_int x;
    cl_int y;
    cl_int z;
    cl_int w;

    cl_int16 vn16;
} __attribute__ ((aligned (128))) UserStruct;

and analogously for the device one (with int and int16 in place of cl_int and cl_int16).

As a side note, nothing prevents you from using the same structure definition for both the host and the device code. The cl_int types are just aliases for native types anyway (although the explicit alignment specifier will still be necessary, because the structure will potentially be processed by different compilers).

fjarri

On Windows with the Visual Studio C++ compiler, use __declspec(align(...)) to align the struct, as in the following lines; the __attribute__ syntax is for the GNU compiler (GCC).

typedef __declspec(align(128)) struct UserStruct {

    cl_int x;
    cl_int y;
    cl_int z;
    cl_int w;

    cl_int16 vn16;
} UserStruct;
Erkang