0

I encountered a problem when I tried to calculate the mean of an array in two ways. Below is the code:

float sum1, sum2, tmp, mean1, mean2;
double sum1_double, sum2_double, tmp_double;
int i, j;
int Nt=29040000;  //array size
int piecesize=32;
int Npiece=Nt/piecesize;
float* img;
float* d_img;
double* img_double;
img_double = (double*)calloc(Nt, sizeof(double));
cudaHostAlloc((void**)&img, sizeof(float)*Nt, cudaHostAllocDefault);
cudaMalloc((void**)&d_img, sizeof(float)*Nt);
...
//Some calculation is done in GPU and the results are stored in d_img;
...    
cudaMemcpy(img, d_img, Nt*sizeof(float), cudaMemcpyDeviceToHost);
for (i=0;i<Nt;i++) img_double[i]=(double)img[i];

//Method 1
sum1=0;
for (i=0;i<Nt;i++) 
{ sum1 += img[i]; }

sum1_double=0;
for (i=0;i<Nt;i++) 
{ sum1_double += img_double[i]; }

//Method 2
sum2=0;
for (i=0;i<Npiece;i++)
{   tmp=0; 
      for (j=0;j<piecesize;j++)
        { tmp += img[i*piecesize+j];}
    sum2 += tmp;
}

sum2_double=0;
for (i=0;i<Npiece;i++)
{   tmp_double=0; 
      for (j=0;j<piecesize;j++)
        { tmp_double += img_double[i*piecesize+j];}
    sum2_double += tmp_double;
}

mean1=sum1/(float)Nt;
mean2=sum2/(float)Nt;
mean1_double=sum1_double/(double)Nt;
mean2_double=sum2_double/(double)Nt;

cout<<setprecision(15)<<mean1<<endl;
cout<<setprecision(15)<<mean2<<endl;
cout<<setprecision(15)<<mean1_double<<endl;
cout<<setprecision(15)<<mean2_double<<endl;

Output:

132.221862792969
129.565872192383
129.565938340543
129.565938340543

The results obtained from the two methods, mean1=129.6, mean2=132.2, are significantly different. May I know why?

Thanks a lot in advance!

Jian Gao
  • 11
  • 3

1 Answers1

4

The reason is that floating point arithmetic is not precise. When you accumulate integers, float becomes imprecise when abs(value) becomes larger than 224 (I'm supposing IEEE-754 32-bit here). For example, float is incapable to store 16777217 precisely (it will become 16777216 or 16777218, depending on the rounding mode).

Supposedly your second calculation is the more precise one, as less precision is lost, because of the separate tmp accumulation.

Change your sum1, sum2, tmp variables to long long int, and hopefully you'll get the same result for both calculations.

Note: I've supposed that your img stores integer data. If it stores floats, then there is no easy way to fix this perfectly. One way is to use double instead of float for sum1, sum2 and tmp. The difference will be there, but it will be much smaller. And there are techniques how to accumuluate floats more precisely than simple summing. Like Kahan Summation.

geza
  • 28,403
  • 6
  • 61
  • 135
  • Thanks for your answer. The data in img is `float` type. How can I change it to `long long int`. – Jian Gao Nov 04 '18 at 14:23
  • You probably want double then – drescherjm Nov 04 '18 at 14:27
  • @geza Thanks for your information. I have tried using `double` type. As you can see in the updated post, with `double` the results obtained from method 1 and method 2 are the same. – Jian Gao Nov 04 '18 at 15:33
  • @drescherjm I have updated my post using double. The results seem to be better. Thanks. – Jian Gao Nov 04 '18 at 15:34
  • @JianGao: In the general case, they won't be the same. Even, they may not be exactly the same currently (print the numbers with more precision, like 20 dignificant digits). But maybe the difference is so small, so it's not a problem for you. – geza Nov 04 '18 at 15:44
  • @geza you are right. I have printed more precision. Yes, for my application, method 2 with `float` data type is enough. – Jian Gao Nov 04 '18 at 16:14