I want to predict the approximate number of open parking lots in a car park for a given time slot (hour of day, day of week, etc.). Using GradientBoostingRegressor, the results seem quite OK so far. However, I'm wondering how I could weight/penalize certain regression errors.
For example, I want the model to be pessimistic for predictions where the true value is (close to) zero. It would be OK to sacrifice some accuracy for that.
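One way to build in that kind of pessimism with GradientBoostingRegressor is its quantile loss, which fits a low conditional quantile instead of the mean, so predictions are biased downward. A minimal sketch on made-up data (the features and target below are invented stand-ins for the car-park data, not the asker's dataset):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(0)

# Toy stand-in for the car-park data: hour of day and day of week as
# features, number of free lots as target (purely illustrative).
X = np.column_stack([rng.randint(0, 24, 500), rng.randint(0, 7, 500)]).astype(float)
y = np.clip(16 - X[:, 0] + rng.normal(0, 2, 500), 0, None)

# loss="quantile" with a small alpha fits a low conditional quantile
# instead of the mean, so the model errs on the pessimistic side.
pessimistic = GradientBoostingRegressor(loss="quantile", alpha=0.1, random_state=0)
pessimistic.fit(X, y)

average = GradientBoostingRegressor(random_state=0)  # default squared-error loss
average.fit(X, y)

# On average, the quantile model predicts fewer free lots than the mean model.
print(pessimistic.predict(X).mean() < average.predict(X).mean())  # True
```

The trade-off is exactly the one described above: the quantile model gives up accuracy in the mean-squared sense in exchange for systematically conservative predictions.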
Consider the following plot of actual vs. predicted values on a test set:
Florian
- This question looks like it belongs on https://stats.stackexchange.com/ – gereleth Apr 19 '17 at 12:01
- Thanks, you're probably right; I'll check how I can move it there. I thought that maybe there's a way to do this with scikit-learn weighting params, thresholds, or something like that. – Florian Apr 19 '17 at 12:12
- Yes, that may be the case. But you can get the actual ideas from stats.stackexchange.com. Then you can check whether scikit-learn has something similar, or try implementing your own. – Vivek Kumar Apr 19 '17 at 12:24
- You probably shouldn't do this explicitly with a regressor of this type. Why not tune some hyperparameters of the regressor instead, for example max_depth or the number of estimators? Gradient boosting can overfit; is your number of estimators best for your case? You can use cross-validation, for example, to find the best number. – sergzach Apr 19 '17 at 16:25
- [GridSearchCV()](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html) could also help you. Pay attention to the result fields `best_estimator_` and `best_params_`. – sergzach Apr 19 '17 at 16:35
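On the scikit-learn weighting params the comments touch on: `fit()` on GradientBoostingRegressor accepts a `sample_weight` array, which is one way to punish errors on near-zero targets harder. A hedged sketch on synthetic data (the threshold of 1.0 and the factor 5.0 are arbitrary illustration values, not recommendations):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(0)
# Synthetic stand-in: one feature (hour of day), clipped-at-zero target.
X = rng.uniform(0, 24, size=(500, 1))
y = np.clip(16 - X[:, 0] + rng.normal(0, 2, 500), 0, None)

# Give rows whose target is (close to) zero extra weight so errors
# there cost more during training; threshold and factor are arbitrary.
weights = np.where(y < 1.0, 5.0, 1.0)

model = GradientBoostingRegressor(random_state=0)
model.fit(X, y, sample_weight=weights)
print(model.predict(X[:5]))
```

Note that `sample_weight` weights training *rows*, not error *directions*; it makes the model try harder on near-zero samples but does not by itself bias predictions low the way a quantile loss would.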
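A minimal sketch of the GridSearchCV route suggested in the last two comments, again on synthetic data (the grid values are illustrative only):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.RandomState(0)
X = rng.uniform(0, 24, size=(200, 2))
y = np.clip(16 - X[:, 0] + rng.normal(0, 2, 200), 0, None)

# Small grid over the two hyperparameters mentioned in the comments;
# the candidate values here are arbitrary examples.
param_grid = {"max_depth": [2, 3], "n_estimators": [50, 100]}
search = GridSearchCV(GradientBoostingRegressor(random_state=0), param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)     # e.g. {'max_depth': 2, 'n_estimators': 100}
print(search.best_estimator_)  # the model refitted with those settings
```

After `fit()`, `best_params_` holds the winning combination and `best_estimator_` the model refitted on the full data with it, as the comment points out.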