Is it possible to use pipeline-preprocessed X and y in later analysis without a pipeline?

Question

I am preprocessing data with Pipelines (turning categoricals into numeric values, encoding, etc.), and it is very convenient.

But later in the project I want to look at feature importances, and for that I need to pass X and y to the visualizer. It does not accept a pipeline, so X and y are not preprocessed.

from sklearn.ensemble import RandomForestClassifier
from yellowbrick.model_selection import FeatureImportances

model = RandomForestClassifier(n_estimators=10)
viz = FeatureImportances(model)
viz.fit(X, y)
viz.show()

Is there a way to feed pipeline-preprocessed data such as X and y into a model? Or should I preprocess and encode the data manually in such cases? Thanks

score 0 · Answer 1 · answered Feb 06 '20 at 13:02

Managed to find a solution.

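You can access the fitted model, which exposes the feature importances, through the pipeline: .steps[1][1] returns the estimator of the second step (assuming the model is the second step of the pipeline).

That said, the Yellowbrick feature importance visualizer seems easier to use with an ordinal/label encoder than with a one-hot encoder.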
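For illustration, here is a minimal sketch of both routes: pulling the fitted model out of the pipeline, and running only the preprocessing step to get an encoded X that can be passed to anything expecting plain arrays. The toy data, the column names and the step names ("preprocess", "model") are made up for the example, and it assumes a two-step pipeline with an OrdinalEncoder inside a ColumnTransformer; adapt it to your own pipeline.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: one categorical and one numeric column.
X = pd.DataFrame({
    "color": ["red", "blue", "green", "blue", "red", "green"],
    "size": [1.0, 2.5, 3.0, 2.0, 1.5, 3.5],
})
y = [0, 1, 1, 1, 0, 1]

# Assumed pipeline layout: step 0 = preprocessing, step 1 = model.
preprocess = ColumnTransformer(
    [("cat", OrdinalEncoder(), ["color"])],
    remainder="passthrough",  # keep the numeric column unchanged
)
pipe = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestClassifier(n_estimators=10, random_state=0)),
])
pipe.fit(X, y)

# Route 1: grab the fitted model from the pipeline.
fitted_model = pipe.steps[1][1]  # same object as pipe.named_steps["model"]
importances = fitted_model.feature_importances_

# Route 2: apply only the fitted preprocessing step and reuse the result.
X_pre = pipe.named_steps["preprocess"].transform(X)

# X_pre is a plain numeric array, so it can be passed wherever X is expected,
# e.g. FeatureImportances(model).fit(X_pre, y) from the question.
standalone_model = RandomForestClassifier(n_estimators=10, random_state=0)
standalone_model.fit(X_pre, y)
```

With an ordinal encoder the number of columns stays the same, so the importances line up one-to-one with the original features. A one-hot encoder expands each categorical column into several columns, which is probably why it is more awkward to use here.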

answered Feb 06 '20 at 13:02

ValdemarT
