I know how to use predict_proba() and the meaning of the output. Can anyone tell me how predict_proba() internally calculates the probability for a decision tree?
2 Answers
Here is the official source code for sklearn.tree.DecisionTreeClassifier's predict_proba method. I found it by going to the official scikit-learn documentation (https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html) and clicking [source] next to the predict_proba method: https://github.com/scikit-learn/scikit-learn/blob/98cf537f5/sklearn/tree/_classes.py#L897. I have also included a snippet of the source code for predict_proba below:
def predict_proba(self, X, check_input=True):
    """Predict class probabilities of the input samples X.

    The predicted class probability is the fraction of samples of the same
    class in a leaf.

    Parameters
    ----------
    X : {array-like, sparse matrix} of shape (n_samples, n_features)
        The input samples. Internally, it will be converted to
        ``dtype=np.float32`` and if a sparse matrix is provided
        to a sparse ``csr_matrix``.

    check_input : bool, default=True
        Allow to bypass several input checking.
        Don't use this parameter unless you know what you're doing.

    Returns
    -------
    proba : ndarray of shape (n_samples, n_classes) or list of n_outputs \
        such arrays if n_outputs > 1
        The class probabilities of the input samples. The order of the
        classes corresponds to that in the attribute :term:`classes_`.
    """
    check_is_fitted(self)
    X = self._validate_X_predict(X, check_input)
    proba = self.tree_.predict(X)

    if self.n_outputs_ == 1:
        proba = proba[:, : self.n_classes_]
        normalizer = proba.sum(axis=1)[:, np.newaxis]
        normalizer[normalizer == 0.0] = 1.0
        proba /= normalizer

        return proba

    else:
        all_proba = []

        for k in range(self.n_outputs_):
            proba_k = proba[:, k, : self.n_classes_[k]]
            normalizer = proba_k.sum(axis=1)[:, np.newaxis]
            normalizer[normalizer == 0.0] = 1.0
            proba_k /= normalizer
            all_proba.append(proba_k)

        return all_proba
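To see the "fraction of samples of the same class in a leaf" idea concretely, here is a small sketch of my own (not part of the scikit-learn source, and using the iris dataset purely as an example): it recomputes the probabilities by hand from the per-leaf class totals stored in clf.tree_.value and compares them with predict_proba.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# clf.tree_.value has shape (n_nodes, n_outputs, n_classes): the per-node
# class totals that the docstring refers to.
leaf_ids = clf.apply(X)                    # index of the leaf each sample lands in
counts = clf.tree_.value[leaf_ids, 0, :]   # class totals of those leaves
manual_proba = counts / counts.sum(axis=1, keepdims=True)

# The division above is the same normalization step done inside predict_proba.
print(np.allclose(manual_proba, clf.predict_proba(X)))  # True

The normalization makes the check robust regardless of whether tree_.value holds raw counts or already-normalized fractions in your scikit-learn version.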

Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Dec 29 '22 at 08:00
I've edited my original answer to include more information about how I found this code! – awesomecosmos Dec 29 '22 at 22:14
First, watch this video for the basics of decision trees: https://www.youtube.com/watch?v=_L39rN6gz7Y, and then this one to see how these probabilities are calculated: https://www.youtube.com/watch?v=wpNl-JwwplA.
In short, predict_proba() returns the probability of occurrence of each of the classes (and predict() returns the class that has the maximum probability from predict_proba()).
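For example (a small sketch of my own, not from the linked videos, using the iris dataset as an assumption), you can check that predict() agrees with the argmax of predict_proba():

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

proba = clf.predict_proba(X)                           # shape (n_samples, n_classes)
labels_from_proba = clf.classes_[proba.argmax(axis=1)] # class with the highest probability

# predict() returns exactly that highest-probability class for each sample.
print(np.array_equal(labels_from_proba, clf.predict(X)))  # True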
