I'm trying to use XGBoost with eval_metric set to auc (as described here).
This works fine when I use the classifier directly, but fails when I use it inside a pipeline.
What is the correct way to pass a .fit argument through an sklearn pipeline?
Example:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris
from xgboost import XGBClassifier
import xgboost
import sklearn
print('sklearn version: %s' % sklearn.__version__)
print('xgboost version: %s' % xgboost.__version__)
X, y = load_iris(return_X_y=True)
# Without using the pipeline:
xgb = XGBClassifier()
xgb.fit(X, y, eval_metric='auc') # works fine
# Making a pipeline with this classifier and a scaler:
pipe = Pipeline([('scaler', StandardScaler()), ('classifier', XGBClassifier())])
# using the pipeline, but not optimizing for 'auc':
pipe.fit(X, y) # works fine
# however this does not work (even after correcting the underscores):
pipe.fit(X, y, classifier__eval_metric='auc') # fails
The error:
TypeError: before_fit() got an unexpected keyword argument 'classifier__eval_metric'
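For reference, my understanding is that Pipeline.fit expects extra fit arguments to be named <step name>__<parameter> (two underscores) and forwards each one to the matching step. Below is a minimal sketch of that naming convention only. It is a simplification, not the actual sklearn source, and route_fit_params is a hypothetical helper I made up for illustration:

```python
def route_fit_params(step_names, fit_params):
    """Group 'step__param' keyword arguments by pipeline step name.

    Hypothetical helper: it only mimics the naming convention,
    it does not call into sklearn or xgboost.
    """
    routed = {name: {} for name in step_names}
    for key, value in fit_params.items():
        if "__" not in key:
            # A single underscore (e.g. 'classifier_eval_metric') lands here.
            raise ValueError(f"expected '<step>__<param>', got {key!r}")
        # Split on the first double underscore only.
        step, param = key.split("__", 1)
        if step not in routed:
            raise ValueError(f"unknown step {step!r}")
        routed[step][param] = value
    return routed


routed = route_fit_params(
    ["scaler", "classifier"],
    {"classifier__eval_metric": "auc"},
)
print(routed)  # {'scaler': {}, 'classifier': {'eval_metric': 'auc'}}
```

By that convention, classifier__eval_metric='auc' should reach the 'classifier' step as eval_metric='auc', which is what I expected the pipe.fit call above to do.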
Regarding the version of xgboost: xgboost.__version__ shows 0.6, and pip3 freeze | grep xgboost shows xgboost==0.6a2.