
I want to use the BernoulliNB() classifier, but my data is not binarized, so I want to choose the best binarization threshold with GridSearchCV(). My code looks like this:

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.naive_bayes import BernoulliNB
from sklearn.preprocessing import Binarizer

pipeline = Pipeline([('binarizer', Binarizer()), ('classifier', BernoulliNB())])
params = {'estimator__binarizer__threshold': np.logspace(0, 5, 20)}

clf = GridSearchCV(pipeline, param_grid=params, cv=5, refit=True)
clf.fit(X_train, y_train)
clf.best_estimator_.score(X_test, y_test)

It gives me this error:

ValueError: Check the list of available parameters with estimator.get_params().keys().

I don't know what's wrong.

KejBi
  • `estimator__binarizer__treshold` should be `estimator__binarizer__threshold`. Spelling of `"threshold"` is wrong. – Vivek Kumar Nov 28 '18 at 08:09
  • That was not a reason. It still gives me this error: ValueError: Invalid parameter estimator for estimator Pipeline(memory=None, steps=[('binarizer', Binarizer(copy=True, threshold=0.0)), ('classifier', BernoulliNB(alpha=1.0, binarize=0.0, class_prior=None, fit_prior=True))]). Check the list of available parameters with `estimator.get_params().keys()`. – KejBi Nov 28 '18 at 17:39
  • Please do not use the comments space for this purpose - *edit & update* your post instead! – desertnaut Nov 28 '18 at 17:59

1 Answer


Yes, my bad. In my comment I only spotted the misspelling 'treshold' and, in a hurry, did not pay attention to the estimator part.

For a pipeline, a parameter is addressed by joining two parts with a double underscore (__):

  1. The name of the step, such as binarizer or classifier here
  2. The actual parameter name for the step chosen in part 1
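As the error message suggests, the valid names can be listed with get_params().keys(). The following is a minimal sketch that rebuilds the pipeline from the question and prints its parameter names; the variable param_names is introduced here only for illustration.

```python
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import Binarizer

# Same two-step pipeline as in the question
pipeline = Pipeline([('binarizer', Binarizer()), ('classifier', BernoulliNB())])

# Every key listed here is a valid name for set_params() and for param_grid
param_names = sorted(pipeline.get_params().keys())
for name in param_names:
    print(name)
```

The printed names include binarizer__threshold, and none of them starts with estimator__.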

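You don't need to prepend estimator to these parts. The estimator__ prefix only applies when setting parameters on the GridSearchCV object itself, whose estimator parameter holds the pipeline; the keys in param_grid are resolved against the pipeline directly. So in your case, you will need to use the following: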

params = {'binarizer__threshold': np.logspace(0, 5, 20)}

to access the 'threshold' parameter of the 'binarizer' step of the pipeline.
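Putting it together, here is a hedged end-to-end sketch of the corrected search. The data is synthetic and stands in for the asker's X_train/X_test (an assumption, not the original dataset), and the threshold grid is narrowed to np.logspace(0, 2, 10) to match the range of that synthetic data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import Binarizer

# Synthetic stand-in data (assumption): 5 features in [0, 100),
# label depends on whether the first feature exceeds 30
rng = np.random.RandomState(0)
X = rng.uniform(0, 100, size=(200, 5))
y = (X[:, 0] > 30).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

pipeline = Pipeline([('binarizer', Binarizer()), ('classifier', BernoulliNB())])

# Key is '<step name>__<parameter name>', with no 'estimator__' prefix
params = {'binarizer__threshold': np.logspace(0, 2, 10)}

clf = GridSearchCV(pipeline, param_grid=params, cv=5, refit=True)
clf.fit(X_train, y_train)

best_threshold = clf.best_params_['binarizer__threshold']
test_accuracy = clf.score(X_test, y_test)  # uses the refitted best pipeline
print(best_threshold, test_accuracy)
```

Because refit=True, clf.score(X_test, y_test) delegates to the best pipeline refitted on the full training set, which is the same model as clf.best_estimator_.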

Vivek Kumar