I have a set of operations running in a loop.
for(int i = 0; i < row; i++)
{
sum += arr1[0] - arr2[0]
sum += arr1[0] - arr2[0]
sum += arr1[0] - arr2[0]
sum += arr1[0] - arr2[0]
arr1 += offset1;
arr2 += offset2;
}
Now I'm trying to vectorize the operations like this
for(int i = 0; i < row; i++)
{
convert_int4(vload4(0, arr1) - vload4(0, arr2));
arr1 += offset1;
arr2 += offset2;
}
But how do I accumulate the resulting vector in the scalar sum
without using a loop?
I'm using OpenCL 2.0.