Optimization of C loop

Question

I have the following function to calculate coefficients:

void CalculateCoefficients(LinearFit *DataSet, double *A, double *B)
  {
   /* Declare and initialize sum variables */
   double S_XX = 0.0;
   double S_XY = 0.0;
   double S_X  = 0.0;
   double S_Y  = 0.0;
   int lcv;

   /* Compute the sums */
   for (lcv=0; lcv < DataSet->NextElement; lcv++)
    {
      S_XX += DataSet->Data_X[lcv] * DataSet->Data_X[lcv];
      S_XY += DataSet->Data_X[lcv] * DataSet->Data_Y[lcv];
      S_X  += DataSet->Data_X[lcv];
      S_Y  += DataSet->Data_Y[lcv];
    } /* for() */

   /* Compute the parameters of the line Y = A*X + B */
   (*A) = (((DataSet->NextElement * S_XY) - (S_X * S_Y)) / ((DataSet->NextElement * S_XX) - (S_X * S_X)));
   (*B) = (((S_XX * S_Y) - (S_XY * S_X)) / ((DataSet->NextElement * S_XX) - (S_X * S_X)));
  } /* CalculateCoefficients() */

I am looking to optimize the loop. I tried loop unrolling but it didn't do much. What else can I do?

You could get significant performance gain using SIMD instructions if your architecture supports it... if you don't mind computing parallel sums and then horizontally adding at the end. It does change the ordering of adds, but that's only significant in more extreme cases. If you have the luxury of choosing your architecture, you may be lucky enough to leverage some seriously wide vectorised instructions. — paddy, Oct 04 '16 at 00:46
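The parallel-sum idea from this comment can be sketched in portable C, without intrinsics, by keeping several independent accumulators and adding them together at the end. This is only an illustration under assumptions: the question never shows `LinearFit`, so the struct layout below is a guess, and the function name `CalculateCoefficients4` and the lane count of 4 are hypothetical. Whether a compiler turns this into SIMD instructions depends on the compiler, flags, and target.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: the question does not show LinearFit, so these field
   types are guesses based on how the fields are used. */
typedef struct {
    double *Data_X;
    double *Data_Y;
    int     NextElement;   /* number of valid points */
} LinearFit;

/* Hypothetical variant with four independent partial sums per quantity. */
void CalculateCoefficients4(const LinearFit *DataSet, double *A, double *B)
{
    assert(DataSet != NULL && A != NULL && B != NULL);

    const double *x = DataSet->Data_X;
    const double *y = DataSet->Data_Y;
    const int n = DataSet->NextElement;

    double sxx[4] = {0.0}, sxy[4] = {0.0}, sx[4] = {0.0}, sy[4] = {0.0};
    int i = 0;

    /* Main loop: lane k only ever accumulates elements i + k, so the four
       dependency chains are independent of each other. */
    for (; i + 4 <= n; i += 4) {
        for (int k = 0; k < 4; k++) {
            const double xv = x[i + k];
            const double yv = y[i + k];
            sxx[k] += xv * xv;
            sxy[k] += xv * yv;
            sx[k]  += xv;
            sy[k]  += yv;
        }
    }

    /* Horizontal add of the partial sums. */
    double S_XX = (sxx[0] + sxx[1]) + (sxx[2] + sxx[3]);
    double S_XY = (sxy[0] + sxy[1]) + (sxy[2] + sxy[3]);
    double S_X  = (sx[0]  + sx[1])  + (sx[2]  + sx[3]);
    double S_Y  = (sy[0]  + sy[1])  + (sy[2]  + sy[3]);

    /* Tail: the 0 to 3 elements left over when n is not a multiple of 4. */
    for (; i < n; i++) {
        S_XX += x[i] * x[i];
        S_XY += x[i] * y[i];
        S_X  += x[i];
        S_Y  += y[i];
    }

    /* Same formulas as the question, for the line Y = A*X + B. */
    const double denom = n * S_XX - S_X * S_X;
    *A = (n * S_XY - S_X * S_Y) / denom;
    *B = (S_XX * S_Y - S_XY * S_X) / denom;
}
```

As the comment says, this changes the order of the additions, so results can differ in the last bits from the original loop. Under strict floating-point rules a compiler is generally not allowed to make that reordering itself, which is why spelling out the separate accumulators can help where plain unrolling did not.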

ad absurdum · Accepted Answer · 2016-10-04T03:51:40.283


You could try:

double dsdx, dsdy;
...
dsdx = DataSet->Data_X[lcv];
dsdy = DataSet->Data_Y[lcv];
S_XX += dsdx * dsdx;
S_XY += dsdx * dsdy;
S_X  += dsdx;
S_Y  += dsdy;
...

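This way each value is read out of your struct only once per iteration, instead of four times for Data_X[lcv] and twice for Data_Y[lcv]. An optimizing compiler may already do this for you, so measure before and after.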
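For reference, here is a minimal sketch of the whole function with that change applied. It rests on assumptions: the question does not show `LinearFit`, so the struct layout is a guess, and hoisting the array pointers and the element count into locals, as well as computing the shared denominator once, are additions beyond what this answer proposes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: the question does not show LinearFit, so these field
   types are guesses based on how the fields are used. */
typedef struct {
    double *Data_X;
    double *Data_Y;
    int     NextElement;   /* number of valid points */
} LinearFit;

void CalculateCoefficients(const LinearFit *DataSet, double *A, double *B)
{
    assert(DataSet != NULL && A != NULL && B != NULL);

    /* Fetch the struct members once, outside the loop. */
    const double *x = DataSet->Data_X;
    const double *y = DataSet->Data_Y;
    const int n = DataSet->NextElement;

    double S_XX = 0.0, S_XY = 0.0, S_X = 0.0, S_Y = 0.0;

    for (int lcv = 0; lcv < n; lcv++) {
        /* Read each element once and reuse the local copies. */
        const double dsdx = x[lcv];
        const double dsdy = y[lcv];
        S_XX += dsdx * dsdx;
        S_XY += dsdx * dsdy;
        S_X  += dsdx;
        S_Y  += dsdy;
    }

    /* Parameters of the line Y = A*X + B; the denominator is shared. */
    const double denom = n * S_XX - S_X * S_X;
    *A = (n * S_XY - S_X * S_Y) / denom;
    *B = (S_XX * S_Y - S_XY * S_X) / denom;
}
```

The additions happen in the same order as in the original loop, so the sums should match the original. Like the original, this divides by zero when all X values are equal or when there are no points.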

edited Oct 04 '16 at 03:51

answered Oct 03 '16 at 22:59

