I am getting a strange error in CUDA with integer division on the long long data type. Here's a condensed version of the code:
    __global__ void Test(bool *d_test_list)
    {
        long long index = threadIdx.x + blockIdx.x * blockDim.x;
        bool test = false;
        if (index / 25 == 5) // Somehow not true when index == 125?
        {
            test = true;
        }
        d_test_list[index] = test;
    }
When I print out the elements of d_test_list, index 125 does not show up as true, and neither does any other index in the range [125, 149], all of which should satisfy the condition. My only guess is that this has something to do with how CUDA handles integer types. The modulus operator (%) gives similarly incorrect results, whereas +, -, and * all work as expected. I am launching with 1024 threads per block; could that be an issue?
I am using CUDA v6.5 RC, but I'd assume integer division would be working correctly by now.
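In case it helps to reproduce this, a minimal self-contained harness could look like the sketch below. It is not my exact host code. The RunTest helper, the grid of 2 blocks (2048 elements in total), and the error checks are assumptions added for illustration. The kernel is the one shown above, and the grid is sized so that every thread writes to a valid element.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel copied from the question, behaviour unchanged.
__global__ void Test(bool *d_test_list)
{
    long long index = threadIdx.x + blockIdx.x * blockDim.x;
    bool test = false;
    if (index / 25 == 5)
    {
        test = true;
    }
    d_test_list[index] = test;
}

// Hypothetical helper (not part of the original code): runs the kernel and
// copies the flags into h_test_list, which must hold
// numBlocks * threadsPerBlock elements. Returns the first CUDA error
// encountered, or cudaSuccess.
cudaError_t RunTest(bool *h_test_list, int numBlocks, int threadsPerBlock)
{
    const size_t bytes = (size_t)numBlocks * threadsPerBlock * sizeof(bool);
    bool *d_test_list = NULL;

    cudaError_t err = cudaMalloc((void **)&d_test_list, bytes);
    if (err != cudaSuccess) return err;

    Test<<<numBlocks, threadsPerBlock>>>(d_test_list);
    err = cudaGetLastError();                               // launch failures
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // execution failures
    if (err == cudaSuccess)
        err = cudaMemcpy(h_test_list, d_test_list, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_test_list);
    return err;
}

int main()
{
    // Assumed configuration: 2 blocks of 1024 threads = 2048 elements, so
    // the allocation exactly matches the number of threads launched.
    const int threadsPerBlock = 1024;
    const int numBlocks = 2;
    static bool h_test_list[numBlocks * threadsPerBlock];

    cudaError_t err = RunTest(h_test_list, numBlocks, threadsPerBlock);
    if (err != cudaSuccess)
    {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Print every index flagged true. If the division behaves correctly,
    // this should be exactly 125 through 149.
    for (int i = 0; i < numBlocks * threadsPerBlock; ++i)
    {
        if (h_test_list[i]) printf("%d\n", i);
    }
    return 0;
}
```

Checking the return values of the launch and the copy should at least show whether the kernel ran to completion, which would separate an arithmetic problem from a launch or memory problem.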