[image: XGBoost decision tree diagram with leaf values, produced by plot_tree]

I am guessing that it is conditional probability given that the above (tree branch) condition exists. However, I am not clear on it.

If you want to read more about the data used or how this diagram is generated, see: http://machinelearningmastery.com/visualize-gradient-boosting-decision-trees-xgboost-python/

dsl1990

4 Answers


For a classification tree with 2 classes {0,1}, the value of the leaf node represents the raw score for class 1. It can be converted to a probability by applying the logistic function. The calculation below uses the left-most leaf as an example.

1/(1+np.exp(-1*0.167528))=0.5417843204057448

What this means is that if a data point ends up in this leaf, the predicted probability of that data point being class 1 is 0.5417843204057448.
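The conversion above can be sketched in Python using only the standard library (`math.exp` in place of `np.exp`); the helper name `leaf_to_probability` is just for illustration:

```python
import math

def leaf_to_probability(raw_score):
    """Map a leaf's raw score (log-odds) to a probability via the logistic function."""
    return 1.0 / (1.0 + math.exp(-raw_score))

# The left-most leaf value from the tree diagram
p = leaf_to_probability(0.167528)
print(p)  # ~0.5418, i.e. class 1 with ~54% probability
```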

Allen Qin
  • Could you share how you know this? or can you give some citation? tks – Pengju Zhao May 15 '19 at 08:46
  • When the objective is "reg:linear", what does the leaf value mean ? I see negative values and 0s – bicepjai May 17 '19 at 00:00
  • can you interpret same for multiclass classification ? what that leaf value in represent ? – Ganesh Kharad Jan 04 '21 at 04:48
  • @Allen, in your example, if the probability of the data point being 1 is 0.54 what is the actual prediction? how do we map the probability value to the class? less than 0.5 is 0 and more than 0.5 is 1? is this the way it works? – SNicolaou Mar 21 '21 at 11:22
  • @SNicolaou what probability threshold to use for deciding class 1 can be figured out during training. For different set of threshold values model can be evaluated using AUC score. – Rakesh K Mar 30 '22 at 10:20

If it is a regression model (objective such as reg:squarederror), then the leaf value is that tree's prediction for the given data point. The leaf value can be negative, depending on your target variable. The final prediction for that data point is the sum of the leaf values it reaches across all the trees.

If it is a classification model (objective such as binary:logistic), then the leaf value is a raw score (log-odds) contributing toward the probability of the data point belonging to the positive class. The final probability prediction is obtained by summing the leaf values (raw scores) across all the trees and then mapping that sum into (0, 1) with the sigmoid function. A leaf value (raw score) can be negative; a raw score of 0 corresponds to a probability of 1/2.
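A minimal sketch of the two cases (the three leaf values below are made-up, and the global base_score that XGBoost also adds is ignored for simplicity):

```python
import math

# Hypothetical raw leaf values reached by one data point, one per tree
leaf_values = [0.167528, -0.05, 0.21]

# reg:squarederror — the prediction is simply the sum of the leaf values
regression_prediction = sum(leaf_values)

# binary:logistic — sum the raw scores, then squash with the sigmoid
raw_score = sum(leaf_values)
probability = 1.0 / (1.0 + math.exp(-raw_score))
print(regression_prediction, probability)
```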

Please find more details about the parameters and outputs at - https://xgboost.readthedocs.io/en/latest/parameter.html

sameershah141
  • What does the leaf value mean in multiclass classification models (multi:softprob)? – Nick Fankhauser Nov 13 '20 at 23:25
  • @NickFankhauser, in case of (multi:softprob), the leaf will contain a probability for each class. The predict function also will return nrows*nclass vector where nclass is number of classes. – sameershah141 Nov 17 '20 at 05:36
  • You can find the details here - https://xgboost.readthedocs.io/en/latest/parameter.html#general-parameters – sameershah141 Nov 17 '20 at 05:37
  • @sameershah141, in the case of a classification model, how do we map the final probability prediction value to the actual predicted value e.g. 0 or 1 if the available classes are {0, 1} – SNicolaou Mar 21 '21 at 10:25

The leaf attribute is the predicted value. In other words, if evaluation of the tree model ends at that terminal node (aka leaf node), then this is the value that is returned.

In pseudocode (the left-most branch of your tree model):

if(f1 < 127.5){
  if(f7 < 28.5){
    if(f5 < 45.4){
      return 0.167528f;   // left-most leaf
    } else {
      return 0.05f;
    }
  }
  // remaining branches of the tree omitted
}
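The same branch can be written as runnable Python (a sketch: the feature dict, the function name, and the None fallback for paths not shown in the diagram are all illustrative):

```python
def predict_left_branch(x):
    """Evaluate the left-most branch of the tree; x maps feature names to values."""
    if x["f1"] < 127.5:
        if x["f7"] < 28.5:
            if x["f5"] < 45.4:
                return 0.167528   # left-most leaf's raw score
            else:
                return 0.05
    return None  # the remaining branches are not shown in the diagram

print(predict_left_branch({"f1": 100, "f7": 20, "f5": 40}))  # 0.167528
```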
user1808924

You are correct. Those values associated with leaf nodes represent the conditional probability, given that the conditions along that branch of the tree are satisfied. Branches of a tree can be presented as a set of rules; for example, the rule @user1808924 gives in his answer represents the left-most branch of your tree model.

So, in short: The tree can be linearized into decision rules, where the outcome is the contents of the leaf node, and the conditions along the path form a conjunction in the if clause. In general, the rules have the form:

if condition1 and condition2 and condition3 then outcome.

Decision rules can be generated by constructing association rules with the target variable on the right. They can also denote temporal or causal relations.
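A minimal sketch of one such linearized rule, using the conditions and leaf value from the left-most branch above (the dict representation and the `rule_fires` helper are illustrative, not XGBoost API):

```python
# One linearized decision rule: a conjunction of conditions plus an outcome.
# All splits in this branch happen to be "<" comparisons.
rule = {
    "conditions": [("f1", 127.5), ("f7", 28.5), ("f5", 45.4)],
    "outcome": 0.167528,
}

def rule_fires(rule, x):
    """True iff every '<' condition in the conjunction holds for data point x."""
    return all(x[feature] < threshold for feature, threshold in rule["conditions"])

print(rule_fires(rule, {"f1": 100, "f7": 20, "f5": 40}))  # True
```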

Wasi Ahmad