
I am trying to fit a linear model using LinearRegression from scikit-learn. The predict function gives me a point estimate, but I need a distribution over the possible values, presumably a Gaussian whose mean is the point value from predict. Is there a way to get such a distribution from any of the scikit-learn models? I looked at the explained variance score, but could not figure out how to map it to the variance of the prediction. Please help.

Fayaz Ahmed

1 Answer

If the data you're fitting really does come from a linear process with additive Gaussian noise, and the sample you used to fit is large enough, then you can get the distribution for the predictions from the R^2 coefficient returned by the score() method of the linear regression object. R^2 is 1 - (variance of prediction error) / (variance of y). So the variance of the prediction error around each predicted point is:

var(error) = (1 - R^2) * var(y)

Use the value from predict() as the mean of the Gaussian and this quantity as its variance.
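
Here is a minimal sketch of that calculation. The synthetic data, the noise level, and the variable names are all assumptions made for illustration; only fit(), predict() and score() come from scikit-learn. It treats the error variance as constant and ignores uncertainty in the fitted coefficients, so take it as a rough approximation rather than a proper prediction interval.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed for illustration): y = 3x + 2 plus Gaussian noise, sd 0.5
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0.0, 0.5, size=500)

model = LinearRegression().fit(X, y)

# R^2 on the data used for fitting
r2 = model.score(X, y)

# Variance of the prediction error: (1 - R^2) * var(y)
resid_var = (1.0 - r2) * np.var(y)
resid_std = np.sqrt(resid_var)

# Cross-check: the same quantity computed directly from the residuals
direct_var = np.var(y - model.predict(X))

# Gaussian for a new point: mean from predict(), spread from resid_std
X_new = np.array([[4.0]])
mean = model.predict(X_new)[0]
low, high = mean - 1.96 * resid_std, mean + 1.96 * resid_std  # approx. 95% band

print(f"mean={mean:.3f}, std={resid_std:.3f}, 95% band=({low:.3f}, {high:.3f})")
```

With a held-out set you could pass that to score() and np.var() instead, which gives a less optimistic estimate of the error variance.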
Keith Brodie