
I have the following:

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier as RFC

par = {"n_estimators":n_estimators,
"max_depth":max_depth,
"class_weight":weight}


scores = {"AUC":"roc_auc","score":my_score} #Scores metric


rfc=RFC()

grid_rfc=GridSearchCV(rfc,
param_grid=par,
cv=10,
scoring=scores,
iid=False,
refit="AUC")

grid_rfc.fit(x_train,y_train)
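
The other variables above are defined earlier in my script; for illustration, they could be something like this (the values and the `make_scorer`-based `my_score` below are placeholders, not the actual definitions):

from sklearn.metrics import make_scorer, f1_score

# illustrative parameter ranges -- the real values are not shown in the question
n_estimators = [100, 200, 500]
max_depth = [None, 5, 10]
weight = [None, "balanced"]

# placeholder for the custom scorer; any callable built with make_scorer works here
my_score = make_scorer(f1_score)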

I can then get the best parameters with `grid_rfc.best_params_`, but the score that produced those best parameters is not listed.

As far as I understand, the score is what the RFC tries to maximize, so I do not understand why it is not present in the best parameters.

EDIT:

It is not the score that the RF produces that I am missing, but which scoring function was used to fit the model that gave the best result (e.g. "AUC" or "my_score" from the `scores` dict).

CutePoison
  • Look at `best_score_` for the score of the `best_params_` [per docs](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html) – Scott Boston Nov 11 '19 at 16:59
  • It is not the score per se that I need, but the scoring function (i.e. "AUC" or "my_score" as shown above in `scores`) – CutePoison Nov 12 '19 at 07:54
  • Do you want to test each type of score to determine which score was graded highest? Or know, by default, which `GridSearchCV` uses? – artemis Nov 12 '19 at 13:40
  • So, when I print `best_params_`, the `scoring` is not present in that list, which I find odd. In an imbalanced dataset there will (probably) be a huge difference between passing `scoring="roc_auc"` and `scoring="accuracy"` (since the former should give a better result). – CutePoison Nov 12 '19 at 13:43

1 Answer


According to the GridSearchCV documentation, you can use the `best_score_` attribute to get the score achieved by `best_params_`.

I cannot test this example, as your code is not complete, but the implementation should look something like this:

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier as RFC

par = {"n_estimators":n_estimators,
"max_depth":max_depth,
"class_weight":weight}


scores = {"AUC":"roc_auc","score":my_score} #Scores metric


rfc=RFC()

grid_rfc=GridSearchCV(rfc,
param_grid=par,
cv=10,
scoring=scores,
iid=False,
refit="AUC")

grid_rfc.fit(x_train,y_train)

# Print the best score
print(grid.best_score_)

Now you may notice it looks slightly different from the true precision; see this Stack Overflow post for more discussion on that.
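
Since you also want to know which scoring function the reported numbers refer to: with a multi-metric `scoring` dict, `best_params_` and `best_score_` always correspond to the metric named in `refit` ("AUC" in your code), while the values for the other metric remain available in `cv_results_`. A rough sketch, assuming the fit above has run:

# best_score_ is the mean cross-validated score of best_params_ under the
# refit metric ("AUC" here), because refit decides how candidates are ranked
best_idx = grid_rfc.best_index_
print(grid_rfc.best_score_)

# the same parameter combination evaluated with each entry of the scoring dict
print(grid_rfc.cv_results_["mean_test_AUC"][best_idx])
print(grid_rfc.cv_results_["mean_test_score"][best_idx])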

artemis
  • It is not the score per se that I need, but the scoring function (i.e. "AUC" or "my_score" from the `scores` dict in my code) – CutePoison Nov 12 '19 at 07:42
  • I am also a bit confused about `scoring` and `refit`: the `scoring` entries are the functions used to evaluate each fit, right? That is, in my example I find the optimal parameters using both `roc_auc` and `my_score`, and `refit="AUC"` is used as the score for those fits? So `refit` is the function applied to all fits to see which is best, i.e. our "main scoring function"? – CutePoison Nov 12 '19 at 07:59