I preprocess my data with Pipelines (converting categoricals to numeric, encoding, etc.), and it is very convenient.

But later in the project I want to check feature importances, and for that I need to pass X and y to the model directly. It does not accept a pipeline, so X and y are not preprocessed.

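from sklearn.ensemble import RandomForestClassifier
from yellowbrick.model_selection import FeatureImportances

model = RandomForestClassifier(n_estimators=10)
viz = FeatureImportances(model)
# X and y are the raw, not yet preprocessed data here
viz.fit(X, y)
viz.show()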

Is there a way to get the data preprocessed by the pipeline as X and y and feed it into a model? Or should I preprocess and encode the data manually in such cases? Thanks.

ValdemarT

1 Answer

Managed to find a solution.

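Basically, you can access a model that exposes feature importances through the pipeline itself, via pipeline.steps[1][1]. This is the estimator object of the second step, assuming the model is the second step of the pipeline.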
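A minimal sketch of this, assuming a two-step pipeline where step 0 is the preprocessing and step 1 is the model. The toy data, the column names and the step names ("preprocess", "model") are made up for illustration and are not from the original project:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: one categorical and one numeric column.
X = pd.DataFrame({
    "color": ["red", "blue", "green", "blue", "red", "green"],
    "size": [1.0, 2.5, 3.0, 2.0, 1.5, 3.5],
})
y = [0, 1, 1, 1, 0, 1]

# Step 0: encode the categorical column, pass the numeric one through.
preprocess = ColumnTransformer(
    [("cat", OrdinalEncoder(), ["color"])],
    remainder="passthrough",
)

# Step 1: the model that exposes feature importances.
pipe = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestClassifier(n_estimators=10, random_state=0)),
])
pipe.fit(X, y)

# The fitted model is the second element of the second (name, estimator) pair.
fitted_model = pipe.steps[1][1]  # same object as pipe.named_steps["model"]
importances = fitted_model.feature_importances_

# Slicing off the last step gives the preprocessing part alone,
# which can produce the encoded X without refitting anything.
X_processed = pipe[:-1].transform(X)
```

X_processed is a plain numeric array, so it can be passed in place of the raw X wherever a bare model is expected, for example to viz.fit(X_processed, y) from the question.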

That said, Yellowbrick's feature importance plot seems easier to use with an ordinal/label encoder than with a one-hot encoder.

ValdemarT