Capping linear regression prediction values using scikit

Question

I am training a linear regression model on a dataset whose real-valued labels lie in the interval [0, 10]. Some of my predictions on the test set exceed 10. Is there a way to cap the predictions at 10?

I am thinking of doing a conditional check such that if a prediction exceeds 10, I explicitly set it to 10.

Is there a better way?

This question is pretty vague. Unless you get a bit more specific, I don't see how anyone can give you a "better way". — Joel Cornett, Mar 17 '12 at 22:15
By better I only mean better than writing an explicit if (value > 10) value = 10 kind of a statement that is executed for every value the regression model emits. This seems like a fairly usual scenario so I am hoping there is a standard way to do this. Does this make it clearer? I will be glad to edit whatever is making the question vague — atlantis, Mar 17 '12 at 22:27
If I understand this correctly, wouldn't it be better to check the range of your linear function and stop calculation of values outside the corresponding domain? — Joel Cornett, Mar 17 '12 at 22:38

Fred Foo · Accepted Answer · 2012-06-24T14:10:49.563

If y is the output of the regression object's predict method, then you can use NumPy's minimum to cap it at 10:

y = np.minimum(y, 10.)

To also cap it below at zero, do

y = np.maximum(np.minimum(y, 10.), 0.)

or, shorter:

y = np.clip(y, 0., 10.)
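As a quick check that the three forms above behave as described, here is a minimal sketch. The array y is made-up data standing in for the output of predict (an assumption for illustration, not values from a real model); only NumPy is needed.

```python
import numpy as np

# Hypothetical predictions standing in for the output of model.predict(X_test);
# these numbers are invented for illustration.
y = np.array([-1.5, 0.0, 3.2, 10.0, 12.7])

# Cap from above only: values greater than 10 become 10.
capped_above = np.minimum(y, 10.)

# Cap from above, then from below at zero.
nested = np.maximum(np.minimum(y, 10.), 0.)

# Same result in one call: clip to the closed interval [0, 10].
clipped = np.clip(y, 0., 10.)

print(capped_above)
print(clipped)
```

All three calls are vectorized, so no per-element if statement is needed, and each returns a new array rather than modifying y in place.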

edited Jun 24 '12 at 14:10

answered Mar 18 '12 at 21:14

Fred Foo
