I'm attempting to add a component that will do regression analysis on data in our application. I wrote a service that uses MathNet Numerics to perform a regression to compute coefficients for various characteristics. I'm testing it by passing it the same data as a regression analysis in Excel. The data I'm using is below in CSV format:
Accuracy,Age,Boundary Lines,Distance,Unit,County,Similarity,Size,Quality
0.894556528,0.301,1,1,1,1,0.753,0.76,0.625
0.870692849,0.234,1,0.737,1,1,0.598,0.804,0.625
1.044444444,0.49,1,1,1,0,0.714,0.933,0.833
1.04120514,0.165,1,1,1,1,0.224,0.268,0.833
1.157635468,0.179,1,1,1,1,0.224,0.268,0.8
1.080750408,1,1,1,1,1,0.826,0.536,0.833
1.100041511,1,1,1,1,1,0.569,0.5,0.833
0.932113964,1,1,1,1,1,0.78,0.25,0.625
0.966800438,0.267,1,1,1,1,0.916,0.408,0.625
0.941051136,0.212,1,0.713,1,1,0.755,0.667,0.625
0.932835821,0.429,1,1,1,1,1,0.999,1
1.689189189,0.12,1,1,1,1,1,0.941,1
1.199451679,0.292,1,1,1,1,1,0.6,1
0.7,0.188,1,0.981,1,1,1,1,0.333
0.795454545,1,1,1,1,1,1,0.364,0.5
"Accuracy" is the dependent value, and I'm calculating coefficients for the other columns using them as independent values. When I do a Regression in Excel, I get these results:
When I test my data in code, I get the same values except for the Boundary Lines and Unit coefficients, while those are 0 in Excel, in my code they are -0.581 and 0.876, respectively.
I'm doing the calculations by creating a Matrix of the ratings columns and a Vector of the Accuracy column and using DirectRegressionMethod.NormalEquations
.
Matrix<double> ratingsMatrix = DenseMatrix.OfRowVectors(ratingsVectors);
Vector<double> accuracyVector = DenseVector.OfEnumerable(accuracyRatings);
Vector<double> coefficientVector = MultipleRegression.DirectMethod(ratingsMatrix, accuracyVector, DirectRegressionMethod.NormalEquations);
I'm trying to figure out the reason for the difference. The results from Excel worked well for what I needed them for, so I'd like to be able to duplicate it. I suspect it could be from the fact that both of those columns are all 1s?