I would like to declare the alignment of a global device variable in CUDA. Specifically, I have a string declaration such as __device__ char str1[] = "some pre-defined string"; With plain gcc, I can request alignment from the compiler with __device__ char str1[] __attribute__ ((aligned (4))) = "some pre-defined string";

However, when I tried this with nvcc, the compiler ignored the request. I want to copy these strings into a buffer in my kernels, and copying a word at a time is much faster than copying a byte at a time, but it requires the source string to be aligned. Can anyone please tell me how to request alignment from the nvcc compiler?

fall3nm0nk

2 Answers

See section 5.3.2 "Size and Alignment Requirement" of the "CUDA C Programming Guide", which can be found here:

The alignment requirement is automatically fulfilled for the built-in vector types derived from char, short, int, long, long long, float, and double, such as float2 or float4.

For structs, the size and alignment requirements can be enforced by the compiler using the alignment specifiers __align__(8) or __align__(16).

Example usage:

struct __align__(8) { 
    float r; 
    float i;
} complex_num;
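
Applied to the string from the question, a minimal sketch might look like the following. It assumes that nvcc honors __align__(4) on a __device__ array declared this way and that the array size is a multiple of 4 bytes; the kernel name copy_words and the buffer names are hypothetical and exist only for illustration.

```cuda
#include <cstdint>
#include <cstdio>

// Assumption: __align__(4) placed before the type requests 4-byte alignment
// for the whole array. 23 characters plus the terminating NUL make 24 bytes.
__device__ __align__(4) char str1[] = "some pre-defined string";

// Word-wise copying must not read past the end of the array.
static_assert(sizeof(str1) % sizeof(uint32_t) == 0,
              "pad str1 to a multiple of 4 bytes");

// Hypothetical kernel: copies str1 into dst one 32-bit word at a time.
// dst must be 4-byte aligned (cudaMalloc allocations are).
// Reading a char array through uint32_t* is common CUDA practice but relies
// on the alignment requested above.
__global__ void copy_words(uint32_t *dst)
{
    const uint32_t *src = reinterpret_cast<const uint32_t *>(str1);
    const unsigned n_words = sizeof(str1) / sizeof(uint32_t);
    for (unsigned i = threadIdx.x; i < n_words; i += blockDim.x)
        dst[i] = src[i];
}

int main()
{
    uint32_t *d_buf = nullptr;
    if (cudaMalloc(&d_buf, sizeof(str1)) != cudaSuccess)
        return 1;

    copy_words<<<1, 32>>>(d_buf);

    // cudaMemcpy waits for the kernel and reports any launch error.
    char h_buf[sizeof(str1)];
    cudaError_t err = cudaMemcpy(h_buf, d_buf, sizeof(str1),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_buf);
    if (err != cudaSuccess)
        return 1;

    printf("%s\n", h_buf);
    return 0;
}
```

If the string length is not a multiple of 4, padding the array explicitly (for example, declaring a fixed size rounded up to the next multiple of 4) keeps the word-wise loop inside the array.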
Robadob
njuffa
  • Thanks. That works. I checked the NVCC compiler guide looking for something mentioned there. I should have checked the C programming guide as well. – fall3nm0nk Nov 03 '12 at 07:16

Can you check if this works?

__device__ char __align__(4) str1[] = "some pre-defined string";
Roger Dahl
  • Thanks for the quick response. This was my first time asking for help on StackOverflow, and I will definitely use it as a resource in the future. – fall3nm0nk Nov 03 '12 at 07:18