I am trying to understand how eli5 and XGBoost can be used together to interpret results.
My question is how the score that eli5 returns is calculated.
From the documentation of eli5 (https://eli5.readthedocs.io/en/latest/autodocs/xgboost.htm):
- Feature weights are calculated by following decision paths in trees of an ensemble.
- Each leaf has an output score, and expected scores can also be assigned to parent nodes.
- Contribution of one feature on the decision path is how much expected score changes from parent to child.
- Weights of all features sum to the output score of the estimator.
The documentation also refers to this post: http://blog.datadive.net/interpreting-random-forests/
That post states: "the prediction is simply the average of the bias terms plus the average contribution of each feature"
I want to understand how eli5 calculates this score. Below, I use dump_model to get the gain value of each split node and the value of each leaf in a very simple XGBClassifier with max_depth=2, n_estimators=2, and a single predictor, X1.
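# model is the fitted XGBClassifier described above
booster = model.get_booster()
# with_stats=True adds gain and cover to every node in the text dump
booster.dump_model('dump.raw.txt', with_stats=True)
Output (from dump.raw.txt):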
booster[0]:
0:[X1<6.5] yes=1,no=2,missing=1,gain=3408.03906,cover=38057.75
1:[X1<2.5] yes=3,no=4,missing=3,gain=490.453125,cover=37151
3:leaf=-0.579237163,cover=34979.75
4:leaf=-0.431787312,cover=2171.25
2:[X1<13.5] yes=5,no=6,missing=5,gain=216.349213,cover=906.75
5:leaf=-0.100638106,cover=547.5
6:leaf=0.198612079,cover=359.25
booster[1]:
0:[X1<4.5] yes=1,no=2,missing=1,gain=2091.49219,cover=35177.2461
1:[X1<1.5] yes=3,no=4,missing=3,gain=154.304688,cover=33770.3086
3:leaf=-0.448578328,cover=29916.4414
4:leaf=-0.384361565,cover=3853.86646
2:[X1<10.5] yes=5,no=6,missing=5,gain=281.797821,cover=1406.93604
5:leaf=-0.167458653,cover=907.320374
6:leaf=0.112893201,cover=499.615723
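To work with these numbers programmatically, the text dump can be turned into plain dictionaries. The following is a minimal sketch that assumes the dump has exactly the line format shown above; the names DUMP, SPLIT_RE, LEAF_RE, and parse_dump are made up for illustration and are not part of XGBoost or eli5.

```python
import re

# Text copied from the dump above (dump_model with with_stats=True).
DUMP = """\
booster[0]:
0:[X1<6.5] yes=1,no=2,missing=1,gain=3408.03906,cover=38057.75
1:[X1<2.5] yes=3,no=4,missing=3,gain=490.453125,cover=37151
3:leaf=-0.579237163,cover=34979.75
4:leaf=-0.431787312,cover=2171.25
2:[X1<13.5] yes=5,no=6,missing=5,gain=216.349213,cover=906.75
5:leaf=-0.100638106,cover=547.5
6:leaf=0.198612079,cover=359.25
booster[1]:
0:[X1<4.5] yes=1,no=2,missing=1,gain=2091.49219,cover=35177.2461
1:[X1<1.5] yes=3,no=4,missing=3,gain=154.304688,cover=33770.3086
3:leaf=-0.448578328,cover=29916.4414
4:leaf=-0.384361565,cover=3853.86646
2:[X1<10.5] yes=5,no=6,missing=5,gain=281.797821,cover=1406.93604
5:leaf=-0.167458653,cover=907.320374
6:leaf=0.112893201,cover=499.615723
"""

NUM = r"[-+.\deE]+"  # loose pattern for the numbers in the dump
SPLIT_RE = re.compile(
    rf"(\d+):\[(\w+)<({NUM})\] yes=(\d+),no=(\d+),missing=(\d+),"
    rf"gain=({NUM}),cover=({NUM})"
)
LEAF_RE = re.compile(rf"(\d+):leaf=({NUM}),cover=({NUM})")


def parse_dump(text):
    """Return one dict per tree, mapping node id -> node attributes."""
    trees = []
    for raw in text.splitlines():
        line = raw.strip()  # the real file indents child nodes
        if not line:
            continue
        if line.startswith("booster["):
            trees.append({})  # start a new tree
            continue
        m = SPLIT_RE.fullmatch(line)
        if m:
            nid, feat, thr, yes, no, missing, gain, cover = m.groups()
            trees[-1][int(nid)] = {
                "feature": feat, "threshold": float(thr),
                "yes": int(yes), "no": int(no), "missing": int(missing),
                "gain": float(gain), "cover": float(cover),
            }
            continue
        m = LEAF_RE.fullmatch(line)
        if m:
            nid, leaf, cover = m.groups()
            trees[-1][int(nid)] = {"leaf": float(leaf), "cover": float(cover)}
    return trees


trees = parse_dump(DUMP)
print(len(trees), trees[0][5])
```

Each tree becomes a dict keyed by node id, so trees[0][5] is leaf 5 of the first tree.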
If I run eli5's show_prediction for one observation (X1=10), I get the following contributions and score.
Output:
y=0 (probability 0.567, score -0.268) top features
Contribution? Feature Value
+0.983 <BIAS> 1.000
-0.715 X1 10.000
The score is -0.268. My question is whether this can be calculated by hand using the gain from each node. I have been trying to implement the decision-path idea from the link above, but I do not end up with the score eli5 returned.
When X1=10, the observation ends up in leaf 5 in both trees.
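For reference, here is a minimal sketch of one way the hand calculation might go. It is one reading of the documentation bullets quoted above, not a confirmed account of eli5's internals. It rests on three assumptions: gain is not used at all, the expected score of a parent node is the cover-weighted average of its children, and base_score adds nothing on the logit scale (true for a base_score of 0.5). The names TREES, expected_score, and explain are made up for illustration; the numbers are copied from the dump above.

```python
import math

# node id -> ("split", threshold, yes_id, no_id, cover) or ("leaf", value, cover)
# X1 is the only feature, so the feature name is not stored.
TREES = [
    {
        0: ("split", 6.5, 1, 2, 38057.75),
        1: ("split", 2.5, 3, 4, 37151.0),
        3: ("leaf", -0.579237163, 34979.75),
        4: ("leaf", -0.431787312, 2171.25),
        2: ("split", 13.5, 5, 6, 906.75),
        5: ("leaf", -0.100638106, 547.5),
        6: ("leaf", 0.198612079, 359.25),
    },
    {
        0: ("split", 4.5, 1, 2, 35177.2461),
        1: ("split", 1.5, 3, 4, 33770.3086),
        3: ("leaf", -0.448578328, 29916.4414),
        4: ("leaf", -0.384361565, 3853.86646),
        2: ("split", 10.5, 5, 6, 1406.93604),
        5: ("leaf", -0.167458653, 907.320374),
        6: ("leaf", 0.112893201, 499.615723),
    },
]


def expected_score(tree, nid):
    """Leaf value, or cover-weighted average of the children (assumption)."""
    node = tree[nid]
    if node[0] == "leaf":
        return node[1]
    _, _, yes, no, _ = node
    cover_yes, cover_no = tree[yes][-1], tree[no][-1]
    total = (cover_yes * expected_score(tree, yes)
             + cover_no * expected_score(tree, no))
    return total / (cover_yes + cover_no)


def explain(trees, x1):
    """Return (bias, contribution of X1) summed over all trees."""
    bias = 0.0
    contribution = 0.0
    for tree in trees:
        nid = 0
        bias += expected_score(tree, nid)  # expected score at the root
        while tree[nid][0] == "split":
            _, threshold, yes, no, _ = tree[nid]
            child = yes if x1 < threshold else no
            # change in expected score from parent to child
            contribution += expected_score(tree, child) - expected_score(tree, nid)
            nid = child
    return bias, contribution


bias, contribution = explain(TREES, 10.0)
score = bias + contribution                 # equals the sum of the two leaf values
prob_y1 = 1.0 / (1.0 + math.exp(-score))    # logistic link for y=1
print(round(bias, 3), round(contribution, 3), round(score, 3), round(1 - prob_y1, 3))
```

Under these assumptions the score comes out at about -0.268, which is just the sum of the two leaf values (-0.100638106 + -0.167458653), and the probability of y=0 at about 0.567. The bias and X1 contribution come out at roughly -0.983 and +0.715, the same magnitudes as in the eli5 table but with opposite signs. That may be because eli5 reports contributions for the y=0 class, though this is a guess.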