I am trying to fit a linear model using LinearRegression from scikit-learn. The predict function gives me a point estimate, but I need a distribution over the possible values, probably a Gaussian whose mean is the point value from predict. Is there a way to get such a distribution from any of the scikit-learn models? I checked the variance score, but could not figure out how to map it to the variance of the prediction. Please help.
1 Answer
If the data you're fitting really does come from a linear process with additive Gaussian noise, and the sample you used to fit is large enough, then you can get the distribution for the predictions from the R^2 coefficient returned by the score() method of the linear regression object. R^2 is 1 - (variance of prediction error) / (variance of y). So the variance of the prediction error around each predicted point is:
var(error) = (1 - R^2) * var(y)
Each prediction can then be treated as a Gaussian with mean predict(x) and that variance.
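
Here is a minimal sketch of that calculation, not a definitive implementation. The synthetic data (slope 2, intercept 1, noise standard deviation 0.5) and the names X_new, error_var, error_std and samples are my own assumptions for illustration. It also ignores the uncertainty in the fitted coefficients, which is only reasonable when the sample is large.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed for illustration): y = 2x + 1 + Gaussian noise with sd 0.5
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(5000, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0.0, 0.5, size=5000)

model = LinearRegression().fit(X, y)

# score() returns R^2 for the data passed to it
r2 = model.score(X, y)

# Variance of the prediction error: (1 - R^2) * var(y)
error_var = (1.0 - r2) * np.var(y)
error_std = np.sqrt(error_var)

# Treat each prediction as the mean of a Gaussian with that standard deviation
X_new = np.array([[3.0], [7.5]])
means = model.predict(X_new)

# Draw 1000 samples per new point from the assumed predictive distribution
samples = rng.normal(loc=means, scale=error_std, size=(1000, len(means)))
```

With this setup error_std should come out close to the 0.5 used to generate the noise, and each column of samples is a set of draws around the corresponding value from predict.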

Keith Brodie