Using best params from GridSearchCV

Question

I don't know if this is the right place to ask, but I will ask anyway. If it is not allowed, please let me know.

I have used GridSearchCV to tune hyperparameters for the best accuracy. This is what I have done:

import numpy as np
from sklearn.model_selection import GridSearchCV  # formerly sklearn.grid_search in older scikit-learn releases
from sklearn.tree import DecisionTreeClassifier

# Search over minimum split size, tree depth, and split criterion
parameters = {'min_samples_split': np.arange(2, 80), 'max_depth': np.arange(2, 10), 'criterion': ['gini', 'entropy']}
clfr = DecisionTreeClassifier()
grid = GridSearchCV(clfr, parameters, scoring='accuracy', cv=8)
grid.fit(X_train, y_train)
print('The parameters combination that would give best accuracy is : ')
print(grid.best_params_)
print('The best accuracy achieved after parameter tuning via grid search is : ', grid.best_score_)

This gives me the following result:

The parameters combination that would give best accuracy is : 
{'max_depth': 5, 'criterion': 'entropy', 'min_samples_split': 2}
The best accuracy achieved after parameter tuning via grid search is :  0.8147086914995224

Now I want to use these parameters when calling a function that visualizes a decision tree.

The function looks something like this:

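import pydotplus
from IPython.display import Image  # assumed source of Image; the original omits its imports
from sklearn.tree import export_graphviz

def visualize_decision_tree(decision_tree, feature, target):
    # Export the fitted tree to DOT format, then render it as a PNG image
    dot_data = export_graphviz(decision_tree, out_file=None,
                               feature_names=feature,
                               class_names=target,
                               filled=True, rounded=True,
                               special_characters=True)
    graph = pydotplus.graph_from_dot_data(dot_data)
    return Image(graph.create_png())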

Right now I am trying to use the best parameters found by GridSearchCV to call the function in the following way:

dtBestScore = DecisionTreeClassifier(parameters = grid.best_params_)
dtBestScore = dtBestScore.fit(X=dfWithTrainFeatures, y= dfWithTestFeature)
visualize_decision_tree(dtBestScore, list(dfCopy.columns.delete(0).values), 'survived')

I am getting an error on the first line of the code, which says:

TypeError: __init__() got an unexpected keyword argument 'parameters'

Is there some way to use the best parameters found by the grid search automatically, rather than reading the result and manually setting the value of each parameter?

Doesn't Python's kwargs unpacking work here, as in `DecisionTreeClassifier(**grid.best_params_)`? See https://pythontips.com/2013/08/04/args-and-kwargs-in-python-explained/ for more on kwargs. – Oliver Dain, Jan 05 '17 at 01:08
That worked amazingly. You can write it as an answer and I can accept it. I am new to this and didn't know much about it. Thanks, that helped a lot. – Cybercop, Jan 05 '17 at 01:15

score 20 · Accepted Answer · edited Jan 05 '17 at 21:41

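Try Python's keyword-argument unpacking (kwargs):

DecisionTreeClassifier(**grid.best_params_)

Note the trailing underscore: the fitted attribute is best_params_, as printed in the question. See http://pythontips.com/2013/08/04/args-and-kwargs-in-python-explained/ for more on kwargs.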
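To make the unpacking concrete, here is a minimal sketch, assuming a recent scikit-learn in which GridSearchCV is imported from sklearn.model_selection. The synthetic data from make_classification and the small grid are placeholders rather than the asker's data, and the names dt_best and dt_refit are only illustrative. It also shows best_estimator_, which a fitted GridSearchCV provides when refit=True (the default) and which may save rebuilding the classifier by hand.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Placeholder data standing in for X_train / y_train from the question
X_train, y_train = make_classification(n_samples=200, n_features=5, random_state=0)

# A deliberately small grid so the sketch runs quickly
parameters = {'max_depth': [2, 3, 4], 'criterion': ['gini', 'entropy']}
grid = GridSearchCV(DecisionTreeClassifier(random_state=0), parameters,
                    scoring='accuracy', cv=3)
grid.fit(X_train, y_train)

# ** unpacks the dict, so {'max_depth': 3, ...} is passed as max_depth=3, ...
dt_best = DecisionTreeClassifier(**grid.best_params_)
dt_best.fit(X_train, y_train)

# Alternative: with refit=True (the default), GridSearchCV has already
# refit a copy of the estimator on the training data with the best parameters
dt_refit = grid.best_estimator_
```

Either dt_best or dt_refit could then be passed to visualize_decision_tree from the question.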

answered Jan 05 '17 at 19:43 by Oliver Dain
edited Jan 05 '17 at 21:41 by Barmar

What's the best way to do this if you've optimized a pipeline? Prefixing the keys with "pipelinestep__" seems to break the mapping of the arguments. – stats-hb Jun 11 '19 at 14:45

param_dict = {x.replace("pipelinestep__", ""): v for x, v in param_dict.items()} – T. Shiftlet May 26 '20 at 20:53
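The one-liner above uses str.replace, which would also alter a key that contains the same substring somewhere other than the start. A slightly stricter sketch follows, assuming a hypothetical step name "pipelinestep" and a made-up best_params dict; the helper strip_step_prefix is illustrative and not part of scikit-learn. If the goal is to configure the pipeline itself rather than a bare estimator, Pipeline.set_params accepts the prefixed keys directly, so pipeline.set_params(**grid.best_params_) should avoid the renaming altogether.

```python
# Hypothetical result of a grid search over a pipeline whose final step
# is named "pipelinestep"; the keys and values here are made up
best_params = {
    "pipelinestep__max_depth": 5,
    "pipelinestep__criterion": "entropy",
    "pipelinestep__min_samples_split": 2,
}

def strip_step_prefix(params, step_name):
    """Return the params belonging to step_name, with the 'step__' prefix removed."""
    prefix = step_name + "__"
    return {
        key[len(prefix):]: value
        for key, value in params.items()
        if key.startswith(prefix)  # ignore parameters of other steps
    }

step_params = strip_step_prefix(best_params, "pipelinestep")
print(step_params)
# {'max_depth': 5, 'criterion': 'entropy', 'min_samples_split': 2}
```

The resulting dict can then be unpacked in the same way as in the answer, for example DecisionTreeClassifier(**step_params).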
