I’m currently studying XGBoost, and I learned that information gain in XGBoost is computed like this:
$$\text{Gain} = \frac{1}{2}\left[\frac{G_L^2}{H_L+\lambda} + \frac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right] - \gamma$$
What confuses me is that I previously learned information gain as (entropy of parent node − weighted sum of entropies of the child nodes), which looks like the opposite of the XGBoost version, where the gain is (score of left node + score of right node − score of parent node). I don’t understand why the order of the subtraction is reversed.
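For concreteness, here is how I currently understand the two computations, as a rough Python sketch (the helper names, the toy labels, and the gradient/hessian numbers are all made up by me just to illustrate the direction of the subtraction; I also drop the $\gamma$ term for simplicity):

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# Classical information gain: parent entropy minus the weighted child entropies.
parent = np.array([0, 0, 0, 1, 1, 1])
left, right = np.array([0, 0, 0, 1]), np.array([1, 1])
n = len(parent)
classic_ig = entropy(parent) - (len(left) / n * entropy(left)
                                + len(right) / n * entropy(right))

# XGBoost-style gain: child scores minus the parent score.
# G and H stand for the sums of first- and second-order gradients in a node;
# the numbers below are invented purely for illustration.
def score(G, H, lam=1.0):
    return G**2 / (H + lam)

G_left, H_left = -2.0, 3.0
G_right, H_right = 4.0, 3.0
xgb_gain = 0.5 * (score(G_left, H_left)
                  + score(G_right, H_right)
                  - score(G_left + G_right, H_left + H_right))

print(classic_ig, xgb_gain)
```

In the first case the children are subtracted from the parent, while in the second the parent is subtracted from the children, which is exactly the reversal I am asking about.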