I would like to declare the alignment for a global device variable in CUDA. Specifically, I have a string declaration, like
__device__ char str1[] = "some pre-defined string";
With plain gcc, I can request alignment from the compiler like this:
__device__ char str1[] __attribute__ ((aligned (4))) = "some pre-defined string";
However, when I tried this with nvcc, the compiler ignored the request. The reason I want this is that my kernels copy these strings into a buffer, and copying a word at a time is much faster than copying a byte at a time, but a word-wise copy requires the source string to be word-aligned. Can anyone please tell me how to request alignment from the nvcc compiler?
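To make the goal concrete, here is a minimal sketch of the word-wise copy I have in mind. It assumes that CUDA's `__align__(4)` specifier can be applied to a global `__device__` array, which I have not confirmed is the right way to do this. The names `copy_words`, `copy_kernel`, `run_copy_demo`, `STR1_TEXT` and `kStr1Size` are made up for illustration, and the cast from `char*` to `unsigned int*` relies on both pointers being 4-byte aligned.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

#define STR1_TEXT "some pre-defined string"

// Size of the string including its terminating NUL (hypothetical helper constant).
constexpr size_t kStr1Size = sizeof(STR1_TEXT);

// Assumption: __align__(4) requests 4-byte alignment for this global array.
__device__ __align__(4) char str1[] = STR1_TEXT;

// Hypothetical helper: copies n bytes, four at a time where possible.
// Both dst and src must be 4-byte aligned for the word loads/stores to be valid.
__device__ void copy_words(char* dst, const char* src, size_t n)
{
    const unsigned int* s = reinterpret_cast<const unsigned int*>(src);
    unsigned int* d = reinterpret_cast<unsigned int*>(dst);
    const size_t words = n / 4;
    for (size_t i = 0; i < words; ++i) {
        d[i] = s[i];                 // one 32-bit load and store per iteration
    }
    for (size_t i = words * 4; i < n; ++i) {
        dst[i] = src[i];             // leftover tail bytes, if n is not a multiple of 4
    }
}

// Single-thread kernel that copies str1 (with its NUL) into out.
__global__ void copy_kernel(char* out)
{
    copy_words(out, str1, sizeof(str1));
}

// Host-side driver: runs the kernel and copies the result back into host_out.
// Returns false if the host buffer is too small or a CUDA call fails.
bool run_copy_demo(char* host_out, size_t host_size)
{
    if (host_size < kStr1Size) return false;
    char* dev_out = nullptr;
    if (cudaMalloc(&dev_out, kStr1Size) != cudaSuccess) return false;
    copy_kernel<<<1, 1>>>(dev_out);
    // cudaMemcpy waits for the kernel and also reports any launch or execution error.
    cudaError_t err = cudaMemcpy(host_out, dev_out, kStr1Size, cudaMemcpyDeviceToHost);
    cudaFree(dev_out);
    return err == cudaSuccess;
}
```

Calling `run_copy_demo(buf, sizeof(buf))` from `main` with a large enough `char buf[]` should leave the string in `buf`. The destination comes from `cudaMalloc`, so it is already suitably aligned; only the source alignment is in question.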