I'd like to ask what the formula is for the gain in XGBoost models for multiclass classification tasks. I know that for regression tasks it is calculated as SIMILARITY_LEFT_CHILD + SIMILARITY_RIGHT_CHILD - SIMILARITY_PARENT, and that for binary classification tasks the gain is calculated as ENTROPY_PARENT - AVG(ENTROPY_CHILDREN).
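For reference, here is the general split-gain formula as I understand it from the XGBoost paper (Chen & Guestrin, 2016), where $G$ and $H$ are the sums of the first- and second-order gradients of the loss over the instances in a node, $\lambda$ is the L2 regularization term, and $\gamma$ is the complexity penalty (so the similarity score above would correspond to $G^2/(H+\lambda)$):

$$\text{Gain} = \frac{1}{2}\left[\frac{G_L^2}{H_L+\lambda} + \frac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right] - \gamma$$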
For multiclass tasks the confusion started when I found far less information and, worse, two different explanations. One explanation suggests using cross-entropy in a calculation similar to the binary classification case: https://medium.datadriveninvestor.com/understanding-the-log-loss-function-of-xgboost-8842e99d975d The other explanation suggests using the Bayesian Information Criterion: https://rpubs.com/mharris/multiclass_xgboost
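To make the two proposals concrete (my notation, paraphrasing the sources): the first scores splits using the multiclass log loss (cross-entropy) over $C$ classes, while the second scores candidate models with the Bayesian Information Criterion, where $\hat{L}$ is the maximized likelihood, $k$ the number of parameters, and $n$ the number of observations:

$$\text{CE} = -\sum_{i=1}^{n}\sum_{c=1}^{C} y_{ic}\,\log p_{ic}, \qquad \text{BIC} = k\ln n - 2\ln\hat{L}$$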
Is either of these sources correct? If so, which one?