
I am training a linear regression model on a dataset with real-valued labels in the interval [0, 10]. Some of my predictions on the test set exceed 10. Is there a way to cap the predictions at 10?

I am thinking of doing a conditional check such that if a prediction exceeds 10, I explicitly set it to 10.

Is there a better way?

atlantis
  • This question is pretty vague. Unless you get a bit more specific, I don't see how anyone can give you a "better way". – Joel Cornett Mar 17 '12 at 22:15
  • By better I only mean better than writing an explicit if (value > 10) value = 10 kind of a statement that is executed for every value the regression model emits. This seems like a fairly usual scenario so I am hoping there is a standard way to do this. Does this make it clearer? I will be glad to edit whatever is making the question vague – atlantis Mar 17 '12 at 22:27
  • If I understand this correctly, wouldn't it be better to check the range of your linear function and stop calculation of values outside the corresponding domain? – Joel Cornett Mar 17 '12 at 22:38
  • I think the question is precise enough – Vladtn Mar 18 '12 at 11:50

1 Answer


If y is the output of the regression object's predict method, then you can use NumPy's minimum to cap it at 10:

y = np.minimum(y, 10.)

To also bound it below at zero, use:

y = np.maximum(np.minimum(y, 10.), 0.)

or, shorter:

y = np.clip(y, 0., 10.)
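
As a quick check that the three forms behave as described, here is a minimal sketch. The array y is made-up data standing in for the output of a predict call (an assumption for illustration, not taken from the question), and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical raw predictions, standing in for the output of predict().
y = np.array([-1.5, 0.0, 3.2, 9.9, 10.0, 12.7])

# Cap from above only: values greater than 10 become 10.
capped_above = np.minimum(y, 10.0)

# Cap from above and below with nested minimum/maximum.
capped_both = np.maximum(np.minimum(y, 10.0), 0.0)

# Same result in a single call.
clipped = np.clip(y, 0.0, 10.0)

# The nested form and np.clip agree element by element.
assert np.array_equal(capped_both, clipped)
print(clipped)
```

All three calls are vectorized and return new arrays, so no explicit per-value if statement is needed and the original y is left unchanged.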
Fred Foo