I use feature selection in combination with a pipeline in scikit-learn. As the feature selection strategy I use SelectKBest.
The pipeline is created and executed like this:
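from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import GridSearchCV  # sklearn.grid_search in releases before 0.18
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

select = SelectKBest(k=5)
clf = SVC(decision_function_shape='ovo')
# Grid over the number of selected features and the SVC penalty parameter C
parameters = dict(feature_selection__k=[1, 2, 3, 4, 5, 6, 7, 8],
                  svc__C=[0.01, 0.1, 1],
                  svc__decision_function_shape=['ovo'])
steps = [('feature_selection', select),
         ('svc', clf)]
pipeline = Pipeline(steps)
cv = GridSearchCV(pipeline, param_grid=parameters)
cv.fit(features_training, labels_training)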
I know that I can get the best parameters afterwards via cv.best_params_. However, this only tells me that k=4 is optimal. I would like to know which features these are. How can this be done?
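
To make the goal concrete, here is a minimal sketch of the kind of lookup I am after. It assumes the grid search is left at its default refit=True, so that cv.best_estimator_ holds the refitted pipeline, and it reads the fitted selector's get_support() mask. The iris data and the feature_names list are stand-ins for my real features_training and labels_training, not part of my actual setup, and k only goes up to 4 here because iris has four features.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Stand-in data (assumption): iris replaces features_training / labels_training.
X, y = load_iris(return_X_y=True)
# Hypothetical column names, used only to make the output readable.
feature_names = ["sepal length", "sepal width", "petal length", "petal width"]

pipeline = Pipeline([("feature_selection", SelectKBest()),
                     ("svc", SVC(decision_function_shape="ovo"))])
parameters = {"feature_selection__k": [1, 2, 3, 4],
              "svc__C": [0.01, 0.1, 1]}
cv = GridSearchCV(pipeline, param_grid=parameters)
cv.fit(X, y)

# With refit=True (the default), best_estimator_ is the pipeline refitted
# on all the data using the best parameter combination.
selector = cv.best_estimator_.named_steps["feature_selection"]

mask = selector.get_support()                 # boolean mask over input columns
indices = selector.get_support(indices=True)  # integer positions of kept columns
selected_names = [name for name, keep in zip(feature_names, mask) if keep]

print(cv.best_params_)
print(indices, selected_names)
```

The mask refers to column positions in the array passed to fit, so mapping it back to names only works if the name list is in the same column order.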