
I have a simple pipeline like this:

pl = Pipeline(steps=[("preprocessor", ColumnTransformer(
                      transformers=[
                     ('num', Pipeline(steps=[('StandardScaler', StandardScaler())]), selector(dtype_exclude="category")),
                     ('cat', Pipeline(steps=[('onehot', OneHotEncoder( sparse = False, handle_unknown='ignore' ))]), selector(dtype_include="category"))])),
               ('LR', LogisticRegression(max_iter = 1000, intercept_scaling = 1))])

I then call pl.fit on my training data, but when I try to inspect the onehot encoder to get the feature names, I keep getting an error saying it hasn't been fitted yet:

pl.fit(X_train.drop(['ID'],axis = 1), y_train)
pl.named_steps['preprocessor'].transformers[1][1].named_steps['onehot'].get_feature_names()

>> This OneHotEncoder instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

And checking confirms it has not been fitted. What am I missing?

from sklearn.utils.validation import check_is_fitted
from sklearn.exceptions import NotFittedError

try:
    check_is_fitted(pl)
except NotFittedError:
    print('not fitted')

>> not fitted
  • please provide minimal reproducible example: dataframe and all code (include imports, for example) – Roim Jul 22 '20 at 07:36
  • `Pipeline`s don't set any attributes at fit time, so they will always fail `check_is_fitted`. As for the rest, see https://stackoverflow.com/q/58704347/10495893 – Ben Reiniger Jul 22 '20 at 16:54
  • Does this answer your question? [Sklearn components in pipeline is not fitted even if the whole pipeline is?](https://stackoverflow.com/questions/58704347/sklearn-components-in-pipeline-is-not-fitted-even-if-the-whole-pipeline-is) – Ben Reiniger Sep 02 '20 at 14:06
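
Following the comments above, here is a minimal sketch of the access pattern the linked answer describes, assuming the pl pipeline defined above has already been fit: a fitted ColumnTransformer keeps the original, unfitted transformers in `transformers`, while the fitted clones live in `transformers_` / `named_transformers_`, so the fitted encoder has to be pulled from the latter.

# Sketch only (assumes pl.fit(...) has been called as above).
# `named_transformers_` holds the *fitted* clones, while `transformers`
# (no trailing underscore) still holds the original, unfitted objects.
ct = pl.named_steps['preprocessor']
fitted_ohe = ct.named_transformers_['cat'].named_steps['onehot']
print(fitted_ohe.get_feature_names())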

0 Answers