Context of my problem:
I'm performing hyperparameter tuning with GridSearchCV from scikit-learn on my random forest regressor. To reduce overfitting, I thought I should maybe use pruning. I checked the docs and found the ccp_alpha parameter, which refers to pruning; I also found this example that covers pruning in a decision tree.
My question:
Since I'm searching for the best parameters of the random forest with GridSearchCV, how should I supply the ccp_alpha value? Should I set it before or after running GridSearchCV, considering that the structure of the model changes every time I run the search? Do you have any references or articles on this?
My point of view:
To me, it makes more sense to perform hyperparameter tuning first and then add ccp_alpha (pruning) before training and testing this "best model", but I'm not sure.
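
For comparison with the approach above, here is a minimal sketch of the other option: treating ccp_alpha as one more entry in the grid, so it is tuned jointly with the other hyperparameters rather than added afterwards. The synthetic data from make_regression, the grid values, and the variable names are all assumptions for illustration, not recommended settings.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data stands in for the real dataset (assumption).
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative grid: ccp_alpha is searched together with the other
# hyperparameters. The candidate values are placeholders, not advice.
param_grid = {
    "max_depth": [None, 5],
    "min_samples_leaf": [1, 5],
    "ccp_alpha": [0.0, 1.0, 10.0],
}

search = GridSearchCV(
    RandomForestRegressor(n_estimators=30, random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

# best_params_ includes the ccp_alpha chosen by cross-validation;
# best_estimator_ is refit on the whole training split with those values.
best_params = search.best_params_
test_r2 = search.best_estimator_.score(X_test, y_test)
print(best_params)
print(test_r2)
```

With this setup the pruning strength is chosen by the same cross-validation as everything else, so there is no separate "before or after" step. Whether that beats tuning first and pruning later on real data is something I would still have to check.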