I have the following function to calculate coefficients:

void CalculateCoefficients(LinearFit *DataSet, double *A, double *B)
  {
   /* Declare and initialize sum variables */
   double S_XX = 0.0;
   double S_XY = 0.0;
   double S_X  = 0.0;
   double S_Y  = 0.0;
   int lcv;

   /* Compute the sums */
   for (lcv=0; lcv < DataSet->NextElement; lcv++)
    {
      S_XX += DataSet->Data_X[lcv] * DataSet->Data_X[lcv];
      S_XY += DataSet->Data_X[lcv] * DataSet->Data_Y[lcv];
      S_X  += DataSet->Data_X[lcv];
      S_Y  += DataSet->Data_Y[lcv];
    } /* for() */

   /* Compute the parameters of the line Y = A*X + B */
   (*A) = (((DataSet->NextElement * S_XY) - (S_X * S_Y)) / ((DataSet->NextElement * S_XX) - (S_X * S_X)));
   (*B) = (((S_XX * S_Y) - (S_XY * S_X)) / ((DataSet->NextElement * S_XX) - (S_X * S_X)));
  } /* CalculateCoefficients() */

I am looking to optimize the loop. I tried loop unrolling but it didn't do much. What else can I do?

Sara Fuerst
  • You could get significant performance gain using SIMD instructions if your architecture supports it... if you don't mind computing parallel sums and then horizontally adding at the end. It does change the ordering of adds, but that's only significant in more extreme cases. If you have the luxury of choosing your architecture, you may be lucky enough to leverage some seriously wide vectorised instructions. – paddy Oct 04 '16 at 00:46
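
The parallel-sums idea from the comment above can be sketched in portable C without any intrinsics. This is only an illustration: the `Sums` struct and the `ComputeSumsTwoLanes` name are hypothetical, and the function takes plain arrays rather than the question's `LinearFit`. Keeping two independent sets of accumulators shortens the dependency chain on each sum and may make it easier for the compiler to vectorize the loop; whether it actually helps depends on the compiler, flags, and target.

```c
#include <stddef.h>

/* Hypothetical container for the four running sums (not in the original code). */
typedef struct {
    double S_XX, S_XY, S_X, S_Y;
} Sums;

/* Accumulate in two independent lanes, then add the lanes together at the end.
   Note: this changes the order of the floating-point additions. */
Sums ComputeSumsTwoLanes(const double *x, const double *y, size_t n)
{
    double sxx0 = 0.0, sxy0 = 0.0, sx0 = 0.0, sy0 = 0.0;  /* even-index lane */
    double sxx1 = 0.0, sxy1 = 0.0, sx1 = 0.0, sy1 = 0.0;  /* odd-index lane  */
    size_t i;
    Sums s;

    for (i = 0; i + 1 < n; i += 2) {
        sxx0 += x[i] * x[i];
        sxy0 += x[i] * y[i];
        sx0  += x[i];
        sy0  += y[i];

        sxx1 += x[i + 1] * x[i + 1];
        sxy1 += x[i + 1] * y[i + 1];
        sx1  += x[i + 1];
        sy1  += y[i + 1];
    }

    /* Handle the leftover element when n is odd. */
    if (i < n) {
        sxx0 += x[i] * x[i];
        sxy0 += x[i] * y[i];
        sx0  += x[i];
        sy0  += y[i];
    }

    /* Horizontal add: combine the two lanes. */
    s.S_XX = sxx0 + sxx1;
    s.S_XY = sxy0 + sxy1;
    s.S_X  = sx0 + sx1;
    s.S_Y  = sy0 + sy1;
    return s;
}
```

Because the additions happen in a different order than in the original loop, the results can differ in the last bits for large or badly scaled data sets.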

1 Answer

You could try:

double dsdx, dsdy;
...
dsdx = DataSet->Data_X[lcv];
dsdy = DataSet->Data_Y[lcv];
S_XX += dsdx * dsdx;
S_XY += dsdx * dsdy;
S_X  += dsdx;
S_Y  += dsdy;
...

This way you only get the values out of your struct once in each iteration of the loop.
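
Putting the fragment above into a complete function might look like the sketch below. The question does not show how `LinearFit` is defined, so the struct layout here (two `double` pointers and an `int` count) is an assumption, and `CalculateCoefficientsCached` is a hypothetical name. The sketch also reads `NextElement` once and computes the shared denominator once, which goes slightly beyond the original suggestion. An optimizing compiler may already perform these hoists itself, so measure before expecting a speedup.

```c
/* Assumed layout: the question does not show the real LinearFit definition. */
typedef struct {
    double *Data_X;
    double *Data_Y;
    int NextElement;   /* number of valid data points */
} LinearFit;

/* Least-squares fit of Y = A*X + B, reading each struct field as few
   times as possible. Hypothetical name; same formulas as the question. */
void CalculateCoefficientsCached(const LinearFit *DataSet, double *A, double *B)
{
    double S_XX = 0.0, S_XY = 0.0, S_X = 0.0, S_Y = 0.0;
    double dsdx, dsdy, denom;
    const double *xs = DataSet->Data_X;   /* cache the array pointers */
    const double *ys = DataSet->Data_Y;
    const int n = DataSet->NextElement;   /* cache the element count  */
    int lcv;

    for (lcv = 0; lcv < n; lcv++) {
        dsdx = xs[lcv];                   /* one load per value per iteration */
        dsdy = ys[lcv];
        S_XX += dsdx * dsdx;
        S_XY += dsdx * dsdy;
        S_X  += dsdx;
        S_Y  += dsdy;
    }

    /* Both parameters share this denominator, so compute it once.
       It is zero when all X values are equal (or n == 0); as in the
       original code, the caller must avoid that case. */
    denom = (n * S_XX) - (S_X * S_X);
    *A = ((n * S_XY) - (S_X * S_Y)) / denom;
    *B = ((S_XX * S_Y) - (S_XY * S_X)) / denom;
}
```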

ad absurdum