I am confused by the following Pipeline
weirdness. Suppose I define a pipeline thus:
pipe = Pipeline([
    ('transformer', ColumnTransformer([('sc', StandardScaler(), [0, 1])])),
    ('model', LinearRegression())
])
Now define a DataFrame thus:
df = pd.DataFrame(np.random.rand(10, 4))
Now, interestingly, pipe.fit(df[[0, 1, 2]], df[3])
works fine. However,
pipe.predict(df[[0, 1, 2]])
does not, while pipe.predict(df[[0, 1]])
does. This seems wrong (pipelines are supposed to do their magic on both the fit
and predict steps). Am I missing something?
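For reference, here is a self-contained script that should reproduce the setup above. It is only a sketch: the imports are the ones I assume are needed, try_predict is a hypothetical helper I added so that a failing call is reported instead of stopping the script, and the two-column call is written as df[[0, 1]]. Which of the two predict calls succeeds may well depend on the scikit-learn version installed, so the script prints the outcome rather than assuming it.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Same pipeline as in the question: scale columns 0 and 1, then fit a linear model.
# ColumnTransformer drops the remaining columns by default (remainder='drop').
pipe = Pipeline([
    ('transformer', ColumnTransformer([('sc', StandardScaler(), [0, 1])])),
    ('model', LinearRegression()),
])

# Four random columns: 0-2 are passed as features, 3 is the target.
df = pd.DataFrame(np.random.rand(10, 4))
pipe.fit(df[[0, 1, 2]], df[3])


def try_predict(X):
    """Hypothetical helper: return the prediction shape, or the error as a string."""
    try:
        return pipe.predict(X).shape
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


results = {
    'three columns': try_predict(df[[0, 1, 2]]),
    'two columns': try_predict(df[[0, 1]]),
}
for label, outcome in results.items():
    print(label, '->', outcome)
```

Since only columns 0 and 1 reach the model, pipe.named_steps['model'].coef_ has two entries even though fit was given three columns, which is part of why the predict behaviour surprises me.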