I trained my classifier using a pipeline:
param_tuning = {
    'classifier__learning_rate': [0.01, 0.1],
    'classifier__max_depth': [3, 5, 7, 10],
    'classifier__min_child_weight': [1, 3, 5],
    'classifier__subsample': [0.5, 0.7],
    'classifier__n_estimators': [100, 200, 500],
}

cat_pipe = Pipeline(
    [
        ('selector', ColumnSelector(categorical_features)),
        ('encoder', ce.one_hot.OneHotEncoder())
    ]
)

num_pipe = Pipeline(
    [
        ('selector', ColumnSelector(numeric_features)),
        ('scaler', StandardScaler())
    ]
)

preprocessor = FeatureUnion(
    transformer_list=[
        ('cat', cat_pipe),
        ('num', num_pipe)
    ]
)

xgb_pipe = Pipeline(
    steps=[
        ('preprocessor', preprocessor),
        ('classifier', xgb.XGBClassifier())
    ]
)

grid = GridSearchCV(xgb_pipe, param_tuning, cv=5, n_jobs=-1, scoring='accuracy')
xgb_model = grid.fit(X_train, y_train)
The training data contain categorical features, so after one-hot encoding the transformed training data has shape (x, 100). After that, I try to explain the model's prediction on unseen data. Since I preprocess a single unseen example on its own, it is transformed to shape (1, 15), because a single observation does not contain every category of every categorical feature.
eli5.show_prediction(xgb['classifier'], pd.DataFrame(xgb['preprocessor'].fit_transform(df), columns=xgb['classifier'].get_booster().feature_names))
And I got:
ValueError: Shape of passed values is (1, 15), indices imply (1, 100).
This occurs because the model was trained on the whole preprocessed dataset with shape (x, 100), but I pass the explainer a single observation with shape (1, 15). How do I correctly pass a single unseen observation to the explainer?
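
To make the mismatch concrete, here is a minimal sketch using only the standard library. `TinyOneHot` is a hypothetical toy class written for this illustration, not the `category_encoders` API, and the toy data is made up. It only shows the general mechanism: re-fitting an encoder on one row learns just that row's categories, while transforming with an encoder fitted on the training data keeps the full column set.

```python
# Toy stand-in for a one-hot encoder (hypothetical, for illustration only).
class TinyOneHot:
    def fit(self, rows):
        # Learn one output column per (feature index, category) pair seen in the data.
        seen = {(i, value) for row in rows for i, value in enumerate(row)}
        self.columns_ = sorted(seen)
        return self

    def transform(self, rows):
        # Reuse the learned columns; categories absent from a row become zeros.
        return [
            [1 if row[i] == value else 0 for i, value in self.columns_]
            for row in rows
        ]

    def fit_transform(self, rows):
        # Re-learns the columns from `rows`, discarding anything learned before.
        return self.fit(rows).transform(rows)


# Made-up training data with two categorical features.
train = [("red", "S"), ("green", "M"), ("blue", "L"), ("red", "L")]
single = [("green", "S")]

encoder = TinyOneHot().fit(train)

# Width when the single row is encoded with the encoder fitted on the training data.
width_transform = len(encoder.transform(single)[0])

# Width when a fresh encoder is fitted on the single row alone.
width_refit = len(TinyOneHot().fit_transform(single)[0])

print(width_transform, width_refit)
```

In terms of the pipeline above, this suggests that calling `fit_transform` on the single observation is what shrinks the output, and that calling `transform` on the already fitted preprocessor (for example the `'preprocessor'` step of `grid.best_estimator_`) should keep the training-time width. Whether that alone is everything eli5 needs is an assumption I have not verified.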