
I am using XGBoost's pred_contribs feature to get per-sample interpretability (SHAP values) for my model.

booster.predict(test, pred_contribs=True)

It returns a contribution matrix of shape (number of samples) x (number of features + 1), where the last column is the bias term. For each sample, the sum of the contributions equals the margin score.

However, I would like to work with probabilities instead of margin scores, so for simplicity I would like to convert the contributions (approximately) to the probability scale.

Is there a way to do that?

Code example:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import xgboost as xgb

X, y = make_classification()
X_train, X_test, y_train, y_test = train_test_split(X, y)

dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

param = {
    'max_depth': 2,
    'eta': 1,
    'verbosity': 0,  # 'silent' was deprecated in favour of 'verbosity'
    'objective': 'binary:logistic',
    'eval_metric': 'auc'
}

booster = xgb.train(param, dtrain, 50)

probabilities = booster.predict(dtest)

margin_score = booster.predict(dtest, output_margin=True)

contributions = booster.predict(dtest, pred_contribs=True)
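For binary:logistic, the relationship between the three predict calls above is: the per-row sum of the contributions (including the bias column) reproduces the margin score, and the sigmoid of the margin gives the predicted probability. A minimal NumPy sketch of that relationship, using a small hand-made contribution matrix in place of the real booster output:

```python
import numpy as np

def sigmoid(margin):
    # Logistic link used by binary:logistic to map margin -> probability.
    return 1.0 / (1.0 + np.exp(-margin))

# Toy contribution matrix: 2 samples, 3 features plus 1 bias column,
# standing in for booster.predict(dtest, pred_contribs=True).
contributions = np.array([[0.4, -0.1, 0.2, 0.5],
                          [-0.3, 0.6, 0.1, 0.5]])

# Row sums reproduce what output_margin=True would return...
margin = contributions.sum(axis=1)

# ...and the sigmoid of the margin is the predicted probability.
probability = sigmoid(margin)
```

This is why the conversion is not a simple rescaling: the sigmoid is nonlinear, so per-feature contributions on the margin scale do not map one-to-one onto the probability scale.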
Thomas

1 Answer


I am not sure it is exactly the same question, but you might want to take a look at my answer to a similar question here.

Basically, you divide your vector of contributions by its sum, and multiply it by the predicted probability:

contributions = contributions / sum(contributions) * predicted_probability, where predicted_probability is the predicted probability for the class of interest, applied row by row (per sample).

Again, I am not 100% certain this is the correct way to do things, but in my use case it works OK.
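The rescaling described above can be sketched as follows. This is only an approximation, not a principled transformation (the sigmoid is nonlinear), and the toy arrays here stand in for the real booster outputs; note it also breaks down if a row's margin is close to zero:

```python
import numpy as np

def rescale_contributions(contributions, probabilities):
    # Naive rescaling: make each row of margin-scale contributions
    # sum to the predicted probability instead of the margin.
    # contributions: shape (n_samples, n_features + 1), last column is bias.
    row_sums = contributions.sum(axis=1, keepdims=True)  # == margin scores
    return contributions / row_sums * probabilities[:, np.newaxis]

# Toy values standing in for booster.predict(...) outputs.
contributions = np.array([[0.4, -0.1, 0.2, 0.5],
                          [-0.3, 0.6, 0.1, 0.5]])
probabilities = 1.0 / (1.0 + np.exp(-contributions.sum(axis=1)))

approx = rescale_contributions(contributions, probabilities)
# Each row of approx now sums to the predicted probability.
```

A more principled alternative is to compute SHAP values directly on the probability scale, e.g. with the shap package's interventional explainers, but the simple rescaling above works acceptably in practice.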

StefanPopov