I am using XGBoost's sklearn API with sklearn's RandomizedSearchCV() to train a boosted tree model with cross validation. My problem is imbalanced, so I've supplied the scale_pos_weight parameter to my XGBClassifier. For simplicity, let's say that I'm doing cross validation with two folds (k = 2). At the end of this post, I've provided an example of the model I'm fitting.
How is accuracy (or any metric) calculated on the validation set? Is the accuracy weighted using the scale_pos_weight argument given to XGBoost, or does sklearn calculate the unweighted accuracy?
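To make the distinction concrete, here is a minimal sketch (my own illustration, not XGBoost or sklearn internals) of the two possibilities: plain unweighted accuracy versus an accuracy where the positive class is upweighted, mirroring a hypothetical scale_pos_weight of 4:

```python
import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([0, 0, 0, 0, 1])  # imbalanced: a single positive example
y_pred = np.array([0, 0, 0, 0, 0])  # a classifier that always predicts 0

# Unweighted accuracy: plain fraction of correct predictions.
print(accuracy_score(y_true, y_pred))  # 0.8

# Weighted accuracy: upweight the positive class by 4 (mirroring a
# hypothetical scale_pos_weight of 4), so the missed positive costs more.
weights = np.where(y_true == 1, 4.0, 1.0)
print(accuracy_score(y_true, y_pred, sample_weight=weights))  # 0.5
```

If the second number is what the search reports, the metric is weighted; if the first, it is not.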
import xgboost as xgb
from sklearn.model_selection import RandomizedSearchCV

xgb_estimator = xgb.XGBClassifier(booster="gbtree")
tune_grid = {"scale_pos_weight": [1, 10, 100], "max_depth": [1, 5, 10]}  # simple hyperparameters as example.
xgb_search = RandomizedSearchCV(xgb_estimator, tune_grid, cv=2, n_iter=10,
                                scoring="accuracy", refit=True,
                                return_train_score=True)
results = xgb_search.fit(X, y)  # X, y: my feature matrix and labels
results.cv_results_  # look at cross validation metrics
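One way I thought of probing this is to resolve the scorer that scoring="accuracy" maps to and compare it against a manual accuracy_score on a held-out split. This sketch uses a synthetic imbalanced dataset and a LogisticRegression stand-in (swapped in only to keep the example light; the scorer resolution is the same for any estimator):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, get_scorer

# Synthetic imbalanced data, split by hand into train/validation halves.
X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)
train, val = np.arange(100), np.arange(100, 200)
clf = LogisticRegression().fit(X[train], y[train])

# The scorer object that scoring="accuracy" resolves to inside the search.
scorer = get_scorer("accuracy")

# Score the validation half both ways and compare.
manual = accuracy_score(y[val], clf.predict(X[val]))
print(scorer(clf, X[val], y[val]) == manual)
```

If the two values always agree, the search's reported accuracy is the plain unweighted one.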