
I have a scikit-learn pipeline that uses a custom transformer, defined like this:

class MyPipelineTransformer(TransformerMixin):

which defines the methods `__init__`, `fit()` and `transform()`.

However, when I use the pipeline inside RandomizedSearchCV, I get the following error:

'MyPipelineTransformer' object has no attribute 'get_params'

I've read online (e.g. links below)

(Python - sklearn) How to pass parameters to the customize ModelTransformer class by gridsearchcv

http://scikit-learn.org/stable/auto_examples/hetero_feature_union.html

that I could get 'get_params' by inheriting from BaseEstimator, instead of my current code inheriting just from TransformerMixin. But my transformer is not an estimator. Is there any downside to having a non-estimator inherit from BaseEstimator? Or is that the recommended way to get get_params for any transformer (estimator or not) in a pipeline?

Max Power

1 Answer


Yes, inheriting from BaseEstimator looks like the standard way to get get_params. For example, the source for sklearn.preprocessing contains:

class FunctionTransformer(BaseEstimator, TransformerMixin):
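Applied to the transformer from the question, the same pattern might look like the sketch below. The `scale` parameter and the body of `transform()` are hypothetical, added only so the example runs. As far as I know, scikit-learn's own terminology treats transformers as a kind of estimator, so inheriting from BaseEstimator is not a misuse. Recent scikit-learn code tends to list mixins before BaseEstimator, which is the order used here; both orders provide `get_params`.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class MyPipelineTransformer(TransformerMixin, BaseEstimator):
    # `scale` is a hypothetical hyperparameter, used only for illustration.
    def __init__(self, scale=1.0):
        self.scale = scale

    def fit(self, X, y=None):
        # Nothing to learn in this sketch; return self so calls can be chained.
        return self

    def transform(self, X):
        return np.asarray(X) * self.scale


transformer = MyPipelineTransformer(scale=2.0)
params = transformer.get_params()      # inherited from BaseEstimator
transformer.set_params(scale=3.0)      # also inherited; search classes rely on it
transformed = transformer.fit_transform([[1.0, 2.0]])  # from TransformerMixin
```

With `get_params` and `set_params` available, the pipeline step can be tuned from RandomizedSearchCV using the usual `stepname__scale` parameter naming.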
maxymoo
  • Note also that `__init__` needs to be set up properly: https://stackoverflow.com/a/68926629/14278409 – njp Nov 12 '22 at 22:36
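To illustrate the point about `__init__`: the sketch below follows the general scikit-learn convention (not necessarily the exact content of the linked answer) that every constructor argument is an explicit keyword and is stored unchanged on an attribute of the same name. `get_params` reads the constructor signature, and `clone`, which the search classes use to make fresh copies, rebuilds the object from those values. The class name `ColumnScaler` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone


class ColumnScaler(TransformerMixin, BaseEstimator):
    def __init__(self, factor=1.0, offset=0.0):
        # Store each argument as-is, under the same name, with no validation
        # or conversion here; that work belongs in fit().
        self.factor = factor
        self.offset = offset

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.asarray(X) * self.factor + self.offset


original = ColumnScaler(factor=2.0, offset=1.0)
rebuilt = clone(original)  # new unfitted object built from get_params()
```

If `__init__` renamed or modified its arguments (for example, storing `factor` as `self._factor`), `get_params` would fail to find the value and cloning would break.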