I'm trying to use the GPU for some image processing. In my kernel function I caught a "misalignment" exception, described as:
The thread tried to read or write data that is misaligned on hardware that does not provide alignment. For example, 16-bit values must be aligned on 2-byte boundaries; 32-bit values on 4-byte boundaries, and so on.
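To make the alignment rule in that description concrete, here is a minimal host-side C++ sketch. `IsAligned` is a hypothetical helper introduced only for illustration; it is not part of OpenCL or of the code in this question.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (illustration only): returns true when `address`
// is a multiple of `alignment`, e.g. 2 for 16-bit values, 4 for 32-bit values.
inline bool IsAligned(const void* address, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(address) % alignment == 0;
}
```

For example, with a buffer that starts on a 4-byte boundary, the address two bytes into it is suitable for a 16-bit access but not for a 32-bit one.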
I reduced the kernel code to loops only, but I still get this problem. My reduced kernel function:
    __kernel void TestKernel(
        global const uchar* iImage,
        global uchar* oImage,
        uint width,
        uint height,
        uchar dif,
        float power)
    {
        uint y = get_global_id(0);
        if (y >= height)
            return;
        for (uint x = 0; x < width; ++x) {
            for (uint i = 0; i < 5; ++i) {
                uint sum = 0;
                for (uint j = 0; j < 5; ++j) {
                    sum += 3;
                }
            }
        }
    }
(The program throws the exception in the second loop.)
I'm using the C++ wrapper to call my kernel:
    kernel.setArg(iArg++, iImage);
    kernel.setArg(iArg++, oImage);
    kernel.setArg(iArg++, header.GetVal(header.Width));
    kernel.setArg(iArg++, header.GetVal(header.Height));
    kernel.setArg(iArg++, (unsigned char)10);
    kernel.setArg(iArg++, saturation);
    queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(header.GetVal(header.Height)), cl::NDRange(128));
oImage and iImage are cl::Buffer, saturation is float, and header.GetVal() returns int.
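One detail of the launch above that can be checked on the host: in OpenCL 1.x the global work size has to be an exact multiple of the work-group size (128 here), otherwise the enqueue call is expected to fail with CL_INVALID_WORK_GROUP_SIZE. Below is a minimal sketch, assuming the image height is not always a multiple of 128. `RoundUpToMultiple` is a hypothetical helper, not part of the OpenCL C++ wrapper, and whether this is related to the misalignment exception is not established.

```cpp
#include <cstddef>

// Hypothetical helper (illustration only): rounds `value` up to the next
// multiple of `multiple`. Assumes `multiple` is greater than zero.
constexpr std::size_t RoundUpToMultiple(std::size_t value, std::size_t multiple)
{
    return ((value + multiple - 1) / multiple) * multiple;
}

// Possible use with the call from the question (int converted explicitly):
//   const std::size_t rows   = static_cast<std::size_t>(header.GetVal(header.Height));
//   const std::size_t global = RoundUpToMultiple(rows, 128);
//   queue.enqueueNDRangeKernel(kernel, cl::NullRange,
//                              cl::NDRange(global), cl::NDRange(128));
```

The padded work-items need no extra handling, because the bounds check at the top of the kernel already makes them return early.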
I'm using Visual Studio 2015 with the CodeXL plugin and running the program on an AMD Spectre (Radeon R7).
What can cause this problem?