I am trying to understand how multiclass classification in XGBoost works. I have read the paper by Chen and Guestrin (2016, https://arxiv.org/abs/1603.02754), but the details are still not clear to me:
Say I want to produce a probabilistic classifier for a 3-category classification task. If I understood correctly, XGBoost fits regression trees as "weak learners", i.e. the components of the boosting model. Therefore, when a new predictor vector is passed to the XGBoost model, each regression tree produces a real value as its "prediction", and the (weighted) combination of these is the boosted model's prediction.
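To make my mental model concrete, here is a sketch of what I think happens at prediction time (the tree outputs and learning rate below are made up for illustration, this is not real XGBoost code):

```python
# Sketch of my understanding: each regression tree maps a feature vector to
# one real-valued leaf score, and the boosted model output is the
# learning-rate-weighted sum of those scores.
import numpy as np

eta = 0.3                                   # learning rate (assumed value)
tree_outputs = np.array([0.8, -0.2, 0.5])   # hypothetical leaf values from 3 trees
score = eta * tree_outputs.sum()            # a single real-valued model output
print(score)
```

So in my picture the model produces one number per sample, which is the source of my confusion below.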
From this question and the derivations in the paper, I gathered that a softmax activation function is applied to the boosted model prediction (a real value?), and that the tree structure (e.g. the splitting points) is determined by optimizing the cross-entropy loss function after the softmax is applied to the model output.
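My understanding of softmax is that it takes a *vector* of K class scores and returns K probabilities summing to 1, which is exactly why a single scalar output confuses me. A minimal sketch of softmax as I understand it:

```python
# Softmax as I understand it: it needs a vector of K real-valued scores
# (one per class) and returns K probabilities that sum to 1.
import numpy as np

def softmax(scores):
    z = np.exp(scores - scores.max())   # subtract max for numerical stability
    return z / z.sum()

p = softmax(np.array([1.2, -0.3, 0.5]))
print(p, p.sum())   # three probabilities summing to 1
```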
What is not clear to me is how exactly three class probabilities are obtained. If the model output is just a single real value (a weighted combination of the individual regression trees' outputs), how can applying the softmax function to it return 3 probabilities?
I am using the XGBoost library in both Python and R, but that probably makes no difference.