I cannot seem to debug the error I get from the following code:

from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.svm import LinearSVC

a = TextTransformer('description', max_features=50)
b = TextTransformer('features', max_features=10)
pipeline = Pipeline([
    ('feats', FeatureUnion([
        ('description', a),  # can pass in either a pipeline
        ('features', b)      # or a transformer
    ])),
    ('clf', LinearSVC())  # classifier
])
pipeline.fit(df, df['interest_level'])

The TextTransformer class:

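from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer

class TextTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, column, max_features=5000):
        # self._custom_tokenizer is a method of this class (not shown here)
        self.tfidfVectorizer = TfidfVectorizer(use_idf=False, stop_words='english',
                                               tokenizer=self._custom_tokenizer, analyzer='word',
                                               max_features=max_features)
        self._vectorizer = None
        self._column = column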

I do not understand where the multiple arguments for max_features come from, since I only pass the constructor two arguments: the column name and max_features.

Am I missing something here?
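
For reference, here is a minimal sketch of how this kind of message can arise in plain Python. It assumes the error reads "got multiple values for argument 'max_features'" (the exact traceback is not shown above). That message appears when a value is bound to the same parameter both positionally and by keyword. The class OldTransformer is hypothetical: it stands in for a definition of TextTransformer whose __init__ has no column parameter, for example a stale definition still loaded in the session.

```python
class OldTransformer:
    """Hypothetical stand-in: an __init__ without a `column` parameter."""

    def __init__(self, max_features=5000):
        self.max_features = max_features


try:
    # 'description' fills max_features positionally, then the keyword
    # argument tries to bind max_features a second time.
    OldTransformer('description', max_features=50)
except TypeError as exc:
    message = str(exc)
    print(message)
else:
    message = None
```

If that assumption holds, checking which definition is actually in scope, for instance with inspect.signature(TextTransformer.__init__), would show whether the signature matches the class posted above.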

aceminer
