
I have the following:

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier as RFC

par = {"n_estimators":n_estimators,
"max_depth":max_depth,
"class_weight":weight}


scores = {"AUC":"roc_auc","score":my_score} #Scores metric


rfc=RFC()

grid_rfc=GridSearchCV(rfc,
param_grid=par,
cv=10,
scoring=scores,
iid=False,
refit="AUC")

grid_rfc.fit(x_train,y_train)
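
The other variables above are defined earlier in my script; for illustration, they could be something like this (the values and the `make_scorer`-based `my_score` below are placeholders, not the actual definitions):

from sklearn.metrics import make_scorer, f1_score

# illustrative parameter ranges -- the real values are not shown in the question
n_estimators = [100, 200, 500]
max_depth = [None, 5, 10]
weight = [None, "balanced"]

# placeholder for the custom scorer; any callable built with make_scorer works here
my_score = make_scorer(f1_score)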

I can then get the best parameters with `grid_rfc.best_params_`, but the score that produced those best parameters is not listed.

As far as I understand, the score is what the RFC tries to maximize, so I do not understand why it is not present in the best parameters.

EDIT:

It is not the score that the RF produces that I am missing, but which scoring function was used to fit the model that gave the best result (e.g. "AUC" or "my_score" from the `scores` dict).

CutePoison
  • Look at `best_score_` for the score of the `best_params_` [per docs](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html) – Scott Boston Nov 11 '19 at 16:59
  • It is not the score per se that I need, but the scoring function (i.e. "AUC" or "my_score" as shown above in `scores`) – CutePoison Nov 12 '19 at 07:54
  • Do you want to test each type of score to determine which score was graded highest? Or know, by default, which `GridSearchCV` uses? – artemis Nov 12 '19 at 13:40
  • So, when I print `best_params_`, the `scoring` is not present in that list, which I find odd. In an imbalanced dataset there will (probably) be a huge difference between passing `scoring="roc_auc"` and `scoring="accuracy"` (since the former should give a better result). – CutePoison Nov 12 '19 at 13:43

1 Answer


According to the GridSearchCV documentation, you can use the `best_score_` attribute to get the score achieved by `best_params_`.

I cannot test this example, as your code is not complete, but the implementation should look something like this:

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier as RFC

par = {"n_estimators":n_estimators,
"max_depth":max_depth,
"class_weight":weight}


scores = {"AUC":"roc_auc","score":my_score} #Scores metric


rfc=RFC()

grid_rfc=GridSearchCV(rfc,
param_grid=par,
cv=10,
scoring=scores,
iid=False,
refit="AUC")

grid_rfc.fit(x_train,y_train)

# Print the best score
print(grid.best_score_)

Now you may notice it looks slightly different from the true precision; see this Stack Overflow post for more discussion on that.
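
Since you also want to know which scoring function the reported numbers refer to: with a multi-metric `scoring` dict, `best_params_` and `best_score_` always correspond to the metric named in `refit` ("AUC" in your code), while the values for the other metric remain available in `cv_results_`. A rough sketch, assuming the fit above has run:

# best_score_ is the mean cross-validated score of best_params_ under the
# refit metric ("AUC" here), because refit decides how candidates are ranked
best_idx = grid_rfc.best_index_
print(grid_rfc.best_score_)

# the same parameter combination evaluated with each entry of the scoring dict
print(grid_rfc.cv_results_["mean_test_AUC"][best_idx])
print(grid_rfc.cv_results_["mean_test_score"][best_idx])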

artemis
  • It is not the score per se that I need, but the scoring function (i.e. "AUC" or "my_score" from the `scores` dict in my code) – CutePoison Nov 12 '19 at 07:42
  • I am also a bit confused about `scoring` and `refit`: the `scoring` entries are the functions used to evaluate each fit, right? That is, in my example I find the optimal parameters using both `roc_auc` and `my_score`, and `refit="AUC"` is used as the score for those fits? So `refit` is the function applied to all fits to see which is best, i.e. our "main scoring function"? – CutePoison Nov 12 '19 at 07:59